SAVOT - Simple Access to VOTable - Release 2.6

Author : André Schaaff - Read before any use of available Java classes
  1. SAVOT - Simple Access to VOTable - Release 2.6
    1. Introduction
    2. The Data model
    3. The Pull parsing mode (packages : cds.savot.pull, cds.savot.model and cds.savot.common)
    4. The SAXLike parsing mode (package : cds.savot.sax and cds.savot.common, cds.savot.model is optional)
    5. Quick start with SAVOT Pull !
      1. Example 1 : FULL mode
      2. Example 2 : SEQUENTIAL mode
    6. Quick start with SAVOT SAXLike !
      1. Example : SAVOT SAX
    7. FAQ
    8. Statistics (SAVOT Pull)
    9. Download and links
 
Important : you can send any idea or comment concerning SAVOT to question@simbad.u-strasbg.fr

Introduction

The main goals of this work are :

These different parsers are based on Pull and SAX like parsing methods :

The Data model

The data model has been created to be able to load a VOTable document into memory.
The data model is independent from the parsers.
This model is based on the following VOTable schema (Documentation) :



The Pull parsing mode (packages : cds.savot.pull, cds.savot.model and cds.savot.common)

SAVOT pull parsing can be used in two modes : FULL or SEQUENTIAL

FULL mode :
            DOM parsers are often unable to load large XML documents because they use too much memory.
            SAVOT has been designed to load very large documents into memory.
            In this mode the whole data can be manipulated in memory through the internal data model API (cds.savot.model).
            After modifications, the VOTable document can be saved through the writer (cds.savot.writer).

SEQUENTIAL mode :
            The memory needs are limited to the size of the largest RESOURCE of the VOTable document.
            The internal data model API (cds.savot.model) can also be used to create VOTable documents from scratch.
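
The idea behind the SEQUENTIAL mode, keeping only one RESOURCE in memory at a time, can be sketched with the StAX pull parser shipped with the JDK. This is not the SAVOT API : the class SequentialPullSketch, its method cellsPerResource and the SAMPLE document are hypothetical names chosen for illustration, and nested RESOURCE elements are not handled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustration only : uses the JDK StAX parser, not the SAVOT classes.
public class SequentialPullSketch {

    // Hypothetical two-resource document used by main().
    static final String SAMPLE =
        "<VOTABLE>"
        + "<RESOURCE name=\"r1\"><TABLE><DATA><TABLEDATA>"
        + "<TR><TD>a</TD><TD>b</TD></TR>"
        + "</TABLEDATA></DATA></TABLE></RESOURCE>"
        + "<RESOURCE name=\"r2\"><TABLE><DATA><TABLEDATA>"
        + "<TR><TD>c</TD></TR>"
        + "</TABLEDATA></DATA></TABLE></RESOURCE>"
        + "</VOTABLE>";

    // Returns the number of TD cells of each RESOURCE, in document order.
    static List<Integer> cellsPerResource(String votable) {
        List<Integer> counts = new ArrayList<>();
        List<String> cells = null; // cells of the current RESOURCE, null outside a RESOURCE
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(votable));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("RESOURCE")) {
                        cells = new ArrayList<>();
                    } else if (name.equals("TD") && cells != null) {
                        cells.add(reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("RESOURCE")
                        && cells != null) {
                    counts.add(cells.size());
                    cells = null; // the finished RESOURCE can now be garbage collected
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("cannot parse the document", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("Cells per resource : " + cellsPerResource(SAMPLE));
    }
}
```

At any time the sketch holds at most the cells of a single RESOURCE, which is why the memory needed depends on the largest RESOURCE rather than on the whole file.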

Useful information about the work done around the pull parsing method.

The SAXLike parsing mode (package : cds.savot.sax and cds.savot.common, cds.savot.model is optional)

In some use cases a SAX parsing mode is preferable, because actions can be executed at each step of the parsing.

In this mode SAVOT does not store the data in memory; the developer has to manage part of the process.
This mode is also a good solution if the available memory is limited or if the VOTable files are very large.
Compared to the Pull mode, it often requires more work on the developer's side.
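
The callback style described above can be sketched with the SAX parser shipped with the JDK. This is not the SAVOT SAXLike API : the class SaxCallbackSketch, its method readCells and the SAMPLE document are hypothetical names chosen for illustration. The parser stores nothing itself; the handler decides what to keep, here the text of each TD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only : uses the JDK SAX parser, not the SAVOT classes.
public class SaxCallbackSketch extends DefaultHandler {

    // Hypothetical document used by main().
    static final String SAMPLE =
        "<VOTABLE><RESOURCE><TABLE><DATA><TABLEDATA>"
        + "<TR><TD>a</TD><TD>b</TD></TR><TR><TD>c</TD></TR>"
        + "</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>";

    private final List<String> cells = new ArrayList<>();
    private StringBuilder text; // non-null only while inside a TD

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("TD")) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // the parser may deliver the text of one element in several chunks
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("TD")) {
            cells.add(text.toString());
            text = null;
        }
    }

    // Parses the document and returns the content of every TD, in document order.
    static List<String> readCells(String votable) {
        try {
            SaxCallbackSketch handler = new SaxCallbackSketch();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(votable)), handler);
            return handler.cells;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TD contents : " + readCells(SAMPLE));
    }
}
```

The extra work on the developer's side is visible in the handler : it has to track where it is in the document (the text field) and assemble the data itself.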

Quick start with SAVOT Pull !

How to start with the Pull Parser ?

* The usual questions...

Which packages ?

You can download these packages in the Download corner

And the CLASSPATH ?

Put the four above packages in the CLASSPATH

Does it work ?

Download one of the samples and execute it

If it works, cheers, if not goto *

Before writing any code, you must choose which mode, FULL or SEQUENTIAL, to use to parse the VOTable file.

Example 1 : FULL mode

In this example we show how to create an object which contains the whole VOTable document (FULL mode).
    // the whole VOTable file is put into memory
    SavotPullParser sb = new SavotPullParser(source, SavotPullEngine.FULL); // parsing of the whole source

    System.out.println("Resource name : " + ((SavotResource)sb.getVOTable().getResources().getItemAt(0)).getName());

    // get the VOTable object
    SavotVOTable sv = sb.getVOTable(); // sv is now a reference to a VOTable object

    try {
        BufferedWriter bw = null;
        if (target != null) {
            bw = new BufferedWriter(new FileWriter(target));
        }

        // for each resource
        for (int l = 0; l < sb.getResourceCount(); l++) {
            SavotResource currentResource = (SavotResource)(sv.getResources().getItemAt(l));

            // for each table of the current resource
            for (int m = 0; m < currentResource.getTableCount(); m++) {
                // get all the rows of the table
                TRSet tr = currentResource.getTRSet(m);
                System.out.println("Number of items in TRset (= number of <TR></TR>) : " + tr.getItemCount());

                // for each row
                for (int i = 0; i < tr.getItemCount(); i++) {
                    // get all the data of the row
                    TDSet theTDs = tr.getTDSet(i);
                    String currentLine = "";
                    System.out.println("Number of items in TDSet for the index " + (i+1) + " tr (= number of <TD></TD>) : " + theTDs.getItemCount());

                    // for each data of the row
                    for (int j = 0; j < theTDs.getItemCount(); j++) {
                        currentLine = currentLine + theTDs.getContent(j);
                        System.out.println("<" + theTDs.getContent(j) + ">");
                    }

                    // write the row to the target file if there is one, else to the standard output
                    if (target != null) {
                        if (target.compareTo("") != 0) {
                            bw.write(currentLine);
                            bw.newLine();
                        }
                    }
                    else System.out.println(currentLine);
                }
            }
        }

        // flush and close the writer once all the resources have been written
        if (target != null) {
            bw.flush();
            bw.close();
        }
    } ...

! If you want to try this example, execute the PullFullSample2 class

Example 2 : SEQUENTIAL mode

In this example we show how to use the SEQUENTIAL mode

    // begin the parsing
    SavotPullParser sb = new SavotPullParser(source, SavotPullEngine.SEQUENTIAL); // parsing starting

    // get the next resource of the VOTable file
    SavotResource currentResource = sb.getNextResource(); // get the next resource

    // while a resource is available
    while (currentResource != null) {

        // for each table of this resource
        for (int i = 0; i < currentResource.getTableCount(); i++) {
            TRSet tr = currentResource.getTRSet(i);
            if (tr != null) {
                System.out.println("Number of items in TRset (= number of <TR></TR>) : " + tr.getItemCount());

                // for each row of the table
                for (int j = 0; j < tr.getItemCount(); j++) {
                    // get all the data of the row
                    TDSet theTDs = tr.getTDSet(j);
                    String currentLine = "";
                    System.out.println("Number of items in TDSet for the index " + (j+1) + " tr (= number of <TD></TD>) : " + theTDs.getItemCount());

                    // for each data of the row
                    for (int k = 0; k < theTDs.getItemCount(); k++) {
                        currentLine = currentLine + theTDs.getContent(k);
                        System.out.println("<" + theTDs.getContent(k) + ">");
                    }
                }
            }
        }

        // get the next resource
        currentResource = sb.getNextResource();
    }
! If you want to try this example, execute the PullSeqSample class


Quick start with SAVOT SAXLike !

* The usual questions...

Which packages ?

You can download these packages in the Download corner

And the CLASSPATH ?

Put the above packages in the CLASSPATH

In this mode the developer must implement the SavotSAXConsumer interface, whose methods are called during the parsing.

The developer decides what is done when :
...
See the following example (SavotSAXSample).

Example : SAVOT SAX

In this trivial example, the <VOTABLE ...> attributes, the <RESOURCE ...> attributes and the <TD>...</TD> content are printed on the standard output.

    import java.util.Vector;
    import cds.savot.sax.*;

    public class SavotSAXSample implements SavotSAXConsumer {

        public SavotSAXSample() {
        }

        // attributes is a Vector containing couples of (attribute name, attribute value)
        // example : (attributes.elementAt(0), attributes.elementAt(1)), (attributes.elementAt(2), attributes.elementAt(3)), ...
        /**
         *
         * @param attributes Vector
         */
        public void showAttributes(Vector attributes) {
            for (int i = 0; i < attributes.size(); i = i + 2) {
                System.out.println(
                    "Attribute name : " + attributes.elementAt(i) + " Attribute value : " + attributes.elementAt(i + 1));
            }
        }

        // start elements
        public void startVotable(Vector attributes) {
            showAttributes(attributes);
        }
        public void startDescription() {
        }
        public void startResource(Vector attributes) {
            showAttributes(attributes);
        }
        public void startTable(Vector attributes) {
        }

        ...

        // end elements
        public void endVotable() {}
        public void endDescription() {}
        public void endResource() {}
        public void endTable() {}

        ...

        // TEXT
        public void textTD(String text) {
            System.out.println(text);
        }
        public void textMin(String text) {}
        public void textMax(String text) {}

        ...

        // document
        public void startDocument() {}
        public void endDocument() {}
    }

The following lines must be included in your application :

    ...
    SavotSAXSample consumer = new SavotSAXSample();
    SavotSAXParser sb = new SavotSAXParser(consumer, file);
    ...

The parser then calls the methods of the SavotSAXSample consumer during the parsing process.

FAQ

Q: Why not DOM ?

Q : Why different parsers ?

Statistics (SAVOT Pull)

Test hardware configuration :

Test software configuration :
(*) Windows XP is a trademark of Microsoft
(**) Sun is a trademark of Sun Microsystems

These tests have been done with the pull parser kXML

The whole VOTable document is loaded into the SAVOT internal data model and is available in memory for access through the API.

VOTable files from Simbad database

File          Size (KBytes)   Resources   Tables   Data cells   Parsing time (seconds)
simbad1.xml               9           2        8           64                     0.32
simbad2.xml              70          20      109         1009                     0.37
simbad3.xml             398         200      747         6831                     0.5
simbad4.xml            2854        2000     5821        54515                     1.3
simbad5.xml           29360       20000    61747       557944                    10.45

VOTable files from VizieR database

File          Size (KBytes)   Resources   Tables   Data cells   Parsing time (seconds)
m31.xml                3260         135      166       189020                     1.68
3c273.xml              9634           1        1       639991                     3.6
Download and links

Download corner

Here we have put some links pointing to XML parsers which have been tested

kXML, TinyXML, Xerces, Crimson, NanoXML


©ULP/CNRS - CDS, 11 rue de l'Université, 67000 Strasbourg, France Question@simbad