DocTables in 'raw' format
TBWiki includes a feature for very easy creation and editing of simple databases. These are referred to as "tables". {{TableOfContents}} = Table Data Formats = Table data is stored in regular tbwiki pages. [FIXTHIS - maybe change this to a database directory?], or it can be parsed from an external source. For data in tbwiki pages, you can edit the data in a table via the normal tbwiki page editor, or using table forms. There are several types of tables, which vary depending on their storage format: * '''MoinMoin''' - tb_type="moinmoin" * data is in a separate page (or block on a page), and is specified in MoinMoin syntax (using double vertical bars to delimit fields) * '''attribute files''' - tb_type="attr_files" * multiple files, with one table record per file * data values for the record (field values) are listed as name-value pairs * at least one attribute file must exist * attribute files can list different fields (that is, they can be sparse) * HOWEVER - every attribute file must have a 'record' field, which corresponds to 'record_id' in other places. * '''inline''' - tb_type="inline" * places the data in moin-moin format inline in the same processor block used to declare the table * '''parsed data files''' - data tb_type="parsed" * data is parsed from an external source using using the scanning information provided in a "match_spec" * this uses the ''data_scan'' module * this is currently a read-only data source (no record editing allowed) * '''tbwikidb files''' - data db_type="tbwikidb" * data is parsed from tbwiki pages specified using a source_spec * set match_spec="<tbwikidb>" * must have a a page named "<base_name>template" to use to add new pages * example: _bug_template * '''test data''' - db_type="test_data" * this is a special set of test data internal to the table.py module. It is used for testing only. == MoinMoin Tables == MoinMoin tables are stored in a single file. The records are stored one per line, with the field names in the first row of the table, and the record_ids in the first column of the table. Sample source spec: source_spec=PhoneDatabase {{{ ||Name||Phone|| ||Tim ||555-1212|| ||Nadine||444-2323|| }}} == Attribute files == Attribute files are a collection of files, each containing a single record, where the fields for each record are stored as name-value pairs. Sample source spec: source_spec=Book.* Any file starting with "Book" would be considered a record in the table. Attributes are listed in the file as name=value pairs. Values can be single-line or multi-line. Multi-line values are The first attribute of the record must the name: 'record', which should correspond to a unique identifier for records in this table. Internally, this is referred to as the record_id. The names in the records are NOT required to be perfectly consistent between records. That is, one record may have a 'phone' attribute, while others do not. For purposes of display and editing, the set of fields for the tables is the union of all fields encountered in all records in the set of files in the database. == Inline Tables == Inline tables are stored inline on a regular tbwiki page. They currently only support moinmoin table syntax, and can be either named or unnamed. An inline table is embedded inside a #!Table processor block, and each line of the table has fields separated by double vertical bars. Here's an example: {{{ {{{#!Table ||Name||Job|| ||Tim||engineer|| ||Fred||pilot|| }} } }}} Only one unnamed table may be used on a single wiki page. Multiple named tables may be used on a wiki page. A named table uses a block with a name. {{{ {{{BlockName <data> }} } }}} Also, the table processor declaration can include a block name: {{{ {{{#!Table:BlockName <data> }} } }}} == Parsed data Tables == Using the ''scan_data'' module, you can parse data from multiple sources. These can be tbwiki pages, local files, or web pages. Parsing data from external web pages (scraping them) may introduce long delays in displaying a table. This produces a read-only database. Here is a sample table declaration for the FIXTHIS list for the tbwiki software: {{{ {{{#!Table source_spec=/home/tbird/work/tbwiki/cgi-bin/.*[.]py$:/home/tbird/work/tbwiki/cgi-bin/plugins/.*[.]py$ match_spec=""" record_start=.*FIXTHIS description=.*FIXTHIS[ -]*(.*)$ file=%%(basename)s line_no=%%(line_no)s """ cols=file:line_no:description sortby=file:alpha,line_no:int }} } }}} For more information, see [[DocParsedTables]] == TBWikiDB Tables == This reads a database from regular tbwiki page files in the current wiki. The source_spec for such a table should be a page name prefix, followed by a wildcard (.*). And the match_spec for the table definition should be the exact string "<tbwikidb>". (see source_spec below for more details) {{{ {{{#!Table source_spec=Bug_.* match_spec=<tbwikidb> }} } }}} DB fields are parsed from each page, with a definition creating a single-line value and a level-1 section creating a multi-line value. Here's an example format: {{{ ; Summary: summary of item ; Responsible: Tim = Description = This is a description of the first item, and can have: * regular tbwiki markup }}} To add records, you need to make a "template" page, from which new pages will be created. This is a regular tbwiki page, with a name that ends in the word 'template'. When a user selects "Add Row", a new file will be created using the filename prefix and the wildcard replaced with a record id, with the initial contents being from the template. In the example provided above the template would be named "Bug_template". = Table type specifiers = The following fields, that are part of the table configuration, determine how the table is parsed: == source_spec == For the source_spec, if there is a leading slash, the spec indicates local files (in the server file system). (This is a huge security issue!) If there is no leading slash, the spec indicates pages in the tbwiki data area. This spec can contain wildcards. (At least, trailing wildcards.) If the spec contains a colon, ':', then the first part is the page name and the second part is a block name. If the table name is '#', then the current page is used as the page name. If the string starts with "http", then it indicates a URL for a web page from which the data is loaded. Here are some examples: * source_spec=/home/user/big/security/hole/file - read table data from the indicated file in the host filesystem * source_spec=page_name - read table data from page "page_name" in the set of tbwiki pages * source_spec=page_name:table_block_name - read table data from the block named "table_block_name" in the page "page_name" (from the tbwiki pages). * source_spec=page_name_.* - read table data from multiple pages - from all tbwiki pages that start with "page_name_" * source_spec=http://domain.com/foo.html - read table data from the indicated web page (this is usually used with a match_spec to parse the data on the page into records). * source_spec=/tmp/file1:/tmp/file2.* - read table data from multiple files in the local filesystem (usually used with a match_spec to parse the data in the files into records) == match_spec == A match_spec is used to describe the regular expressions which are used to gather the record fields and values from the data. The match_spec is a specified with a set of lines, one-per regular expression, enclosed in triple-quotes (three double-quotes in a row). To identify the beginning of each record, the "record_start=" special variable is used. Named fields are parsed according to the regular expressions given in the match_spec. By default, each record will have the named fields, and those will be displayed in the resulting table. Each record also has a 'record_id' field, which will have the same value as the first field listed in the match_spec. See [[DocWebSed]] for details about pattern matching and regular expressions that are supported by the ''data_scan'' module. ''basename'' and ''line_no'' are special values which hold the name of the file being read and the line_no in the data where the record_start is found. These can be assigned to arbitrary field names for the record, and used in the table. = Declaring a Table = Table data is stored in one or more files (or parsed from some other source). To show a view of a table on a page, you create a table declaration, usually with a processor block: == Macro Syntax == Here is a sample macro referencing a simple table: {{{ {{Table(Table1Data)}} }}} == Processor Syntax (with examples) == Here is a sample processor syntax for a simple table (with data specified in an external file (another page): {{{ {{{#!Table source_spec=Table1Data }} } }}} Here is the sample processor syntax for a multi-file table: {{{ {{{#!Table source_spec=bug_.* }} } }}} You can also specify data in a named block on either the current page or another page: {{{ {{{#!Table source_spec=#:DataBlock1 }} } }}} {{{ {{{#!Table source_spec=DataPage:DataBlock1 }} } }}} = Table configuration = Table configuration is specified as a list of attributes in name/value pairs, using standard tbwiki configuration syntax (use triple-quotes for multi-line values). Here are some of the attributes you can use with the Table processor: * '''source_spec''' - specifies the source of the data * '''match_spec''' - used for parsed-data and tbwikidb tables * see below * '''cols''' - specify the columns of the table to output (and their order) * e.g. cols=field2:field1:field4 * '''sortby''' - specify the sort order for displayed records * multiple fields can be specified, along with their sort type * e.g. sortby=age:int,name:alpha * sort types are: * '''alpha''' - alphanumeric order (default) * '''rev_alpha''' (reverse alpha) * '''int''' - integer order * '''rev_int''' * '''month''' - by month name * '''rev_month''' * '''possible_values''' - by order in possible_values list * '''rev_possible_values''' * '''field''' - specify field type information (multiple declarations allowed) * these are specified as one or more "field=" statements, using standard attribute syntax * each "field=" line begins a new field type declaration * see below for details * '''row_filter''' - specify an expression used to filter records in the table * you can test for string equality, numeric inequality or a regular expression match * the format of the row_filter is: row_filter=<expression>, where <expression> consists of <field_name><operator><value> * No extraneous spaces are allowed between the word 'row_filter=' and the expression, or between the field_name, operator or value * this allows for testing values with leading spaces * the operator can be one of: =, ==, !=. <, >, <=, >=, *= * '=' and '==' are used for string equality tests: * ex: row_filter=name=Tim * either != or <> can be used for string inequality tests: * ex: row_filter=status!=done * <, >, <=, >=, <> are used for numeric inequality tests * ex: row_filter=age>=30 * ex: row_filter=pressure<>10 * *= is used for regular expression matches * ex: row_filter=status*=closed|fixed * ex: row_filter=uses_linux*=[Yy]es * ex: row_filter=uses_linux*=.*es * if the value has both a leading and trailing double-quote, they are removed * ex: row_filter=name=="spaces at end " * '''add_form''' - specify a form used for adding a new record to the table * see [[TestTableUser]] * '''edit_form''' - specify a form used for editing an existing record in the table * '''conf''' - specify the configuration in a separate block * see below for details * '''query''' - specify a block containing query attributes * Usually, this specifies a block of configuration information, along with a row-filter, to generate a particular view of the data * see below for details * '''is_spreadsheet''' - set to 1 to indicate that the table contains formulas that should be evaluated * '''show_sort_links''' - set to 0 to indicate that sort-links should NOT be shown in the table header. This refers to the small caret that links to a version of the table sorted by that column. * '''show_edit_links''' - set to 0 to indicate that edit links should not be shown as part of table output. This refers to the links underneath the table that allow someone to edit items in the table. This is mostly useful for inline tables that are for presentation only. That is, that are to be edited as part of wiki text, and not with a table form. * '''show_as_markup''' - set to 0 to show data in each cell of the table as plain text. Defaults to 1, which indicates that the table data is processed as tbwiki markup. * '''show_template''' - set to 1 to show the template file in a tbwikidb (so it appears in the table). By default the template file is not shown in the table. == field attributes == You can specify the attributes of individual fields, so that TBWiki can process them properly (generate appropriate default forms, and display the values properly). field attributes are: * '''field''' - specify the name of the field for which attributes are being declared * '''type''' - specify the type of the field * possible types are: Text, Textarea, Select, Checkbox, Radio * '''default_value''' - specify a value to pre-populate in the field in an Add form * '''possible_values''' - list of possible values for the field * specified with a multi-line value, one value per line * '''colors''' - list of specifiers used to define the background color for a field, based on the field's current value * see below for definitions and syntax === dynamic default values === There are some special default values that allow you to automatically populate a field differently for each new record created. To use a dynamic default value, use one of the strings below: * %%(timestamp)s - yields a string in the format: yyyy-mm-dd_HH:MM:SS.SS * %%(next_int)s - yields a value one higher than all other integer values in this field in the table already * %%(next_page_num)s = yields a number one higher than all other page numbers already used for this table (for attrdb and tbwikidb databases only). This number is used in the filename to create a new file. === color specifiers === The table configuration can provide a list of color specifiers that indicate what the background color of a particular cell should be, based on the value of the field in that cell * the 'colors' field attribute is used to provide a list of tests and colors * each line in the list is of the form: * test:color * these list of tests are matched against the field value, and if a match is found, the background color of the cell for that item is shaded in the specified color * a tests consists of either a word or phrase, or a python expression enclosed in parenthesis, using the term 'value'a colon * an expression can be a single word or phrase, in which case if the field value matches the word or phrase exactly, the test passes (and the color is used) * ex: {{{ colors="""good:green bad:red more than one word:yellow""" }}} * colors can be color names or hex triples * expressions can be any valid python expression * expressions can include boolean operators (not, and, or) * expressions can test for equality or inequality * ex: (int(value)<20):blue - show a blue cell for items with value<20 * ex: show "good" cells in green and "bad" cells in red {{{ colors="""(value=="good"):green (value=="bad"):ff0000""" }}} * ex: show particular values in yellow {{{ colors="""(int(value)>5 and int(value)<10):00ffff""" }}} * ex: show some words in orange {{{ colors="""(value in ["lemon", "banana"]):orange (value in ["apple", "cherry", "stawberry", "tomato"]):red""" }}} ==== field attribute samples ==== Field types and possible values: {{{ {{{SampleDataConf field=Summary type=Text field=Description type=Textarea field=Done type=Select possible_values="""Yes No""" field=Happy type=Checkbox field=OS type=Radio possible_values="""Linux BeOs DOS Windows""" default_value=Linux }} } }}} Color Expressions: {{{ {{{#!Table source_spec=#:ReactorTable field=reactor_temperature colors="""(int(value)<10):ffffd0 (int(value)>=10 and int(value)<50):ffff80 (int(value)>500):ffd0d0""" }} } }}} == Configuration blocks == You can specify several "configuration" attributes of a table, and use them repeatedly (i.e. from different processor blocks), using the "conf=" attribute Configuration blocks are used to describe any of the attributes of the table. == Query blocks == You can specify a "query" block, which has a set of table attributes (usually including a row_filter) to show a subset of the data. These can be referenced via a URL, to show different views of the same data. = Views = To show a table on a page, you use the table processor, which has the format: {{{ {{{#!Table <source spec> }} } }}} You can also specify view a == Custom forms and views == You can create custom views of table data, by creating a view file. === View === To create a custom view for a record, ... * line_edit_form_spec, get_default_line_edit_form_spec * there is no way to do a custom view right now. * see def html_string() * should support non-table views? * need view_header, view_record, view_footer definitions in config * view_record needs could have: {{{ view_record_form=""" <tr><td>%%(nameField)s</td><td>custom text...%%(phoneField)s</td></tr> <tr><td>next line:%%(addr1Field)s</td><td>%%(addr2Field)s</td/</tr> """ }}} * or, not in a table format: {{{ view_record_form="""Name: %%(nameField)s<br> Phone: %%(phoneField)s<br> Address: %%(addr1Field)<br>%%addr2Field) <HR> """ }}} Maybe need to use "Value" instead of "Field". Field is used for edit forms. With a different keyword, you could have a mix of read-only and editable fields. This is desirable. Example: {{{ add_form="""Name: %%(nameValue)s<br> Phone: %%(phoneField)s<br> """ }}} === Add Form === To create a custom form to add a record, use the "add_form" attribute in the configuration for a database. This is (almost always) a multi-line attribute in the table configuration, and consists of the HTML used to display the edit fields for the record. Edit fields are placed in the HTML wherever a named python format string occurs, of the type: "%%(<field-name>Field)s" or "%%(<field-name>File<field-type>)s" Ex: To create a custom add form for the Book.* database, add the following attribute in the DBBookConf file, along with the other config: In this file, put HTML (FIXTHIS - should allow tbwiki markup also) to display the form. Whereever you want the form input element for a field of the record, you can place a python named string format item. Here is an example showing fields in an HTML table. {{{ add_form="""<table> <tr> <td>User name:</td><td> %%(record_idField) s</td> </tr><tr> <td>Phone number:</td><td> %%(phoneField) s</td> </tr><tr> <td>Description:</td><td> %%(descriptionFieldTextarea) s</td> </tr><tr> <td>%%(ok_flagFieldCheckbox) s Is OK?</td> </tr> </table> %%(SaveButton) s %%(CancelButton) s<br> """ }}} Note that any default values specified in the database config will be pre-filled in, in the form. Any values that don't have defaults in the configuration will start blank. === Edit form === An edit form is specified the same way, but using the configuration attribute "edit_form". Note that the field values in the form will be pre-filled in with values from the record edited. == Field format strings for edit forms == Field format strings are python named format strings. The name of the python variable consist of the field name, followed by "Field", followed by the type of form element to display. If no type is specified, (either in the field format string, or in the database configuration) a single-line text entry box is used (that is, an HTML INPUT form element). Possible types are: * Text * TextArea * Checkbox * Select * Radio If something besides one of these types is specified, it is used as the INPUT field type in the HTML file element. == Special form elements == The following are special form elements that can be placed anywhere in the form specification: * SaveButton - this shows the button to submit the form data * CancelButton - this cancels the operation and doesn't save the form data = Spreadsheets = Tables may contain formulas which are calculated at runtime to produce a final value for a "cell". In order for the table processor to process formulas, it needs to know that a table is to be interpreted as a spreadsheet. To do this, add the attribute is_spreadsheet=1 to the table configuration (usually this is the first line after the #!Table processor directive at the top of the block). == formulas == Formulas can consist of simple arithmetic, or functions. Formulas operate on data obtained from other cells, including individual cells and ranges, and on literals. A formula always starts with an equal sign ('='). Here is a very simple formula: '=2+2' Currently supported functions are: * SUM(<range>) - calculates the sum of the numbers in the indicated range of cells * COUNT(<range>) - show the number of cells in a range that have numeric values * COUNTA(<range>) - show the number of non-empty cells in a range * AVERAGE(<range>) - show the average for cells containing numeric values in a range of cells * AVERAGEA(<range>) - show the average for cells containing non-empty values in a range of cells (TRUE is calculated as 1, any other non-numeric value is 0, but counts towards the count of cells) * MAX(<list>) or MAX(<range>) - show the maximum value from a list of values or a from a given range * MIN(<list>) or MAX(<range>) - show the minimum value from a list of values or from a given range FIXTHIS - add support for COUNTIF == cell addressing == To unlock the power of formulas, you will want to use values from other cells. Currently, the spreadsheet only does a single pass over the data in the table, going row-by-row, in the order of records in the table database internally. formulas may refer to rows and columns that precede the formula in the table. cells are looked up by row name and column name, with special syntax to allow for referring to the current row and current column. A cell reference starts with a dollar sign ('$'), and then has the row identifier followed by the column identifier. Let's say we had the following table: {{{#!Table:example1 ||item||count||price||total|| ||paper||5||.07||=$paper.count * paper.price|| ||pencil||2||.50||=$pencil.count * $pencil.price|| }}} The row identifier is the name specified as the record_id, or first cell in a row. The column identifiers is a name specified in the first row of the cells. You can omit the row identifier OR column identifier, but not both. Thus, the formula in the last cell in the above table (the one that contains "=$pencil.count * $pencil.price") could be written more shortly by referring to the current row. That is, you could omit the current row name (pencil) and re-write that cell as: "=$.count * $.price", with the same meaning. An empty row identifier means "the current row", and an empty column identifier means "the current column". Notice that this allows the same formula to be used in multiple rows. Also since formulas use names, they don't have to be adjusted if more rows or columns are added to the table. They do have to be adjusted, however, if the names that are referenced are changed. (That is, if you changed the column name of "price" to "cost", you would need to adjust any formulas that used that column name.) You may also refer to cells by absolute or relative position in the table. The absolute position depends on the row_output_list and col_output_list. That is, it may depend on your query. Rows and columns are numbered from 0. Row absolute references must be exactly 5 digits long, with leading zeros. So $00003.5 would refer to the cell in the fourth row and sixth column in the table. Usually, the first row is the header, and the first column is the record key. You can think of the numbering as being 1-based, within the actual data portion of the table. In the sample table above, the address $00001.1 and $paper.count refer to the same cell (with value of 5). You can refer to cells with relative references as well. Use '-' or '+' as the prefix to indicate a backward reference or forward reference, respectively. Here are some examples of relative references: * '$-1.' indicates the cell one row previous (the cell just above the current one) * '$.-1' indicates the cell one column previous (the cell just to the left of the current one) * '$+2.-1' indicates a cell 2 rows down and one column back from the current cell. Functions often operate on ranges. A range is just two cell references, separated by a colon. Like so: {{{ =SUM($paper.total:$pencil.total) }}} Currently, ranges are evaluated left-to-right and top-to-bottom. == examples == See [[TestSpreadsheetStuff]] for examples. = Things to Describe = * sequencing of records * filtering * sorting * views * default values * spreadsheets * match_spec conf description and examples == not implemented yet == Support generated default values for fields: * %%(cur_date)s * %%(next_int)s