DocParsedTables

A "parsed table" is one that is read using the scan_data module (the same module used by the WebSed processor) to create a list of records.

This is similar to a tbwiki table, where the table fields are read from level-1 sections and definition lists, but a generic parsed table is completely free-form in its format. Because of this, there is no mechanism to write the data back: a parsed table database is always read-only.

source

The source_spec for a parsed table can refer to wiki pages, absolute paths, or external URLs. A specification may contain wildcards to read more than one file, and several specifications can be given as a list separated by colons.
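As a rough illustration of how a colon-separated list of absolute-path specifications might be expanded into file names, here is a minimal Python sketch. It is not the tbwiki implementation. The function name `expand_source_spec` is hypothetical, and the sketch assumes that the last path component of each item is a regular expression for file names (as in the FIXTHIS example on this page). It handles only absolute paths, because a naive split on colons would also break URLs at `http:`.

```python
import os
import re


def expand_source_spec(source_spec):
    """Return the files matched by a colon-separated list of path specs.

    Assumption (not confirmed by tbwiki): each item is an absolute
    directory path followed by a regular expression for the file name.
    Wiki page names and URLs are not handled in this sketch.
    """
    matches = []
    for item in source_spec.split(":"):
        # Split "/some/dir/.*[.]py$" into "/some/dir" and ".*[.]py$".
        directory, pattern = os.path.split(item)
        name_re = re.compile(pattern)
        # Sort the names so the result order is predictable.
        for name in sorted(os.listdir(directory)):
            if name_re.match(name):
                matches.append(os.path.join(directory, name))
    return matches
```

Files are returned in specification order, so files from the first item come before files from the second.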

Note that parsing data from external web pages (scraping them) may introduce long delays in displaying a table.

match_spec

See DocWebSed for more information about the data_scan module and how to declare regular expressions for finding data in a page of text.

Here are some miscellaneous rules for match_spec:

examples

Fixthis table

Here is a sample table declaration for the FIXTHIS list for the tbwiki software. It reads the code files (.py files), looking for FIXTHIS entries in the code itself.

{{{#!Table
source_spec=/home/tbird/work/tbwiki/cgi-bin/.*[.]py$:/home/tbird/work/tbwiki/cgi-bin/plugins/.*[.]py$

match_spec="""
record_start=FIXTHIS
description=FIXTHIS[ -]*(.*)$
file=%(basename)s
line_no=%(line_no)s
"""

cols=file:line_no:description
sortby=file:alpha,line_no:int
}}}
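To make the declaration above more concrete, here is a minimal Python sketch of how such a match_spec could be applied to one file. It is an illustration only, not the scan module's actual code. The function names `scan_fixthis` and `sort_records` are hypothetical, and the sketch assumes that patterns are searched anywhere in a line and that line numbers start at 1.

```python
import re

# Patterns copied from the match_spec in the table declaration.
RECORD_START = re.compile(r"FIXTHIS")
DESCRIPTION = re.compile(r"FIXTHIS[ -]*(.*)$")


def scan_fixthis(basename, text):
    """Return one record (a dict) for each line that contains FIXTHIS.

    Assumption: a record starts on any line where record_start is found,
    and the other fields are filled in from that same line.
    """
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not RECORD_START.search(line):
            continue
        record = {"file": basename, "line_no": line_no}
        found = DESCRIPTION.search(line)
        if found:
            record["description"] = found.group(1)
        records.append(record)
    return records


def sort_records(records):
    """Order records like sortby=file:alpha,line_no:int."""
    return sorted(records, key=lambda r: (r["file"], int(r["line_no"])))
```

Sorting line_no as an integer matters here: with an alphabetic sort, line 10 would come before line 2.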

busybox page scrape

Here is an example of scraping the busybox mailing list archive summary page to extract the size of the archive for each month (giving an approximation of community activity level for that month).

{{{#!Table
source_spec=http://lists.busybox.net/pipermail/busybox/
match_spec="""
record_start=.*<tr>
year_month=href="(.*)[.]txt[.]gz"
year=href="(20\d\d)-.*[.]txt[.]gz"
month=href="20\d\d-(.*)[.]txt[.]gz"
size=Gzip'd Text (.*) KB
"""
cols=year_month:year:month:size
sortby=year:alpha,month:month