Difference between revisions of "SE2006:group bar:todo"

From Earlham CS Department
(add bugzilla link)
 
(16 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
For a current TODO list see [https://www.cs.earlham.edu/bugzilla/buglist.cgi?query_format=specific&order=relevance+desc&bug_status=__open__&product=bar&content= Bugzilla].
 +
 
* 5 data sources
 
 +
** energy - done(ish)
 +
*** resultant database has many specific date columns
 +
** water - done(ish)
 +
*** only gets 1990 data. need more dates.
 +
** occupancy - alex - done(ish)
 +
*** could be generalized to get more census data
 +
*** only gets state data not county, region, etc.
 +
** superfund sites - almost done (Aybars & Kevin)
 +
*** we ran into trouble with non-uniform data that made parsing a headache. will try to resolve it or find another source.
 
* scrAPI:
 
** proper javadoc
+
** API to data store
** testing suite
+
*** need to add normalization procedures to SourceDefinition classes
** figure out group skipping
+
*** may serve up normalized XML data via [http://tomcat.apache.org/ tomcat] (this way we can leverage all the source definition/management that we've already done in Java). PHP or Perl are also possibilities (simpler but less robust).
 +
** HTTP POST support.
 +
** testing suite - colin & alex
 +
*** done (Source, Schema); needs more coverage
 +
*** move test source(s) to cvs & set up bar web directory
 +
*** need documentation for testing suite. - alex - done
 +
** proper javadoc - Aybars - done a bunch; needs more coverage
 +
** exception handling
 +
*** (asserts, etc.)
 +
** Source Management - toby - done
 +
***Scheduler - toby - done
 +
***sourceManager - toby - done
 +
***sourceDefinition - toby - done
 +
** source stream  - toby - done
 +
** single place for var name to sql type - done
 +
** figure out group skipping - done
 
*** to skip a group, put a '?' as the first character in that group, e.g., (?.+)
 
 
*** see this [http://www.amk.ca/python/howto/regex/regex.html#SECTION000530000000000000000 awesome regexp tutorial] (specific to Python but talks about Perl as a baseline too)
 
** single place for var name to sql type
 
** exception handling
 
*** (asserts, etc.)
 
** auto (periodic) fetching of data
 
*** make ScrAPI run as a daemon?
 
** thread support (scrape multiple sources simultaneously)
 
 
* database
 
 +
** input - done
 
* geocoding
 
 +
** waiting on database from group fu

Latest revision as of 22:49, 11 April 2006

For a current TODO list see Bugzilla.

  • 5 data sources
    • energy - done(ish)
      • resultant database has many specific date columns
    • water - done(ish)
      • only gets 1990 data. need more dates.
    • occupancy - alex - done(ish)
      • could be generalized to get more census data
      • only gets state data not county, region, etc.
    • superfund sites - almost done (Aybars & Kevin)
      • we ran into trouble with non-uniform data that made parsing a headache. will try to resolve it or find another source.
  • scrAPI:
    • API to data store
      • need to add normalization procedures to SourceDefinition classes
      • may serve up normalized XML data via tomcat (this way we can leverage all the source definition/management that we've already done in Java). PHP or Perl are also possibilities (simpler but less robust).
    • HTTP POST support.
    • testing suite - colin & alex
      • done (Source, Schema); needs more coverage
      • move test source(s) to cvs & set up bar web directory
      • need documentation for testing suite. - alex - done
    • proper javadoc - Aybars - done a bunch; needs more coverage
    • exception handling
      • (asserts, etc.)
    • Source Management - toby - done
      • Scheduler - toby - done
      • sourceManager - toby - done
      • sourceDefinition - toby - done
    • source stream - toby - done
    • single place for var name to sql type - done
    • figure out group skipping - done
      • to skip a group, put a '?' as the first character in that group, e.g., (?.+)
      • see this awesome regexp tutorial (specific to Python but talks about Perl as a baseline too)
  • database
    • input - done
  • geocoding
    • waiting on database from group fu
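
The group-skipping item above can be illustrated with a minimal Java sketch. It assumes plain java.util.regex rather than scrAPI's own pattern handling: in standard Java (and Perl/Python) syntax a group is skipped with the non-capturing form (?:...), so the (?.+) shorthand in the note may be a scrAPI-specific convention. The class name, method name and sample row are hypothetical and are not taken from the scrAPI code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupSkipDemo {
    // Returns the captured groups (1..n) of the first match,
    // or an empty list if the pattern does not match at all.
    public static List<String> capturedGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up sample row, for illustration only.
        String row = "Indiana,1990,6080485";

        // All three fields are captured.
        System.out.println(capturedGroups("(\\w+),(\\d+),(\\d+)", row));

        // The middle field is still matched but not captured,
        // because (?:...) is a non-capturing group.
        System.out.println(capturedGroups("(\\w+),(?:\\d+),(\\d+)", row));
    }
}
```

With the second pattern the year still has to be present for the match to succeed, but it no longer occupies a group number, so the remaining groups map directly onto the columns being stored.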