New Output

2 Jan 2012

Only spammers seem to be noticing this blog, but for any web-trolling software that might be interested in digital humanities and philology, I thought I would add that I have updated the sample output from CollateX.

collatex-table-apparatus.html shows output from user-specified witnesses in three forms: (1) an alignment table in the user-specified order, (2) an extracted base text (the first specified witness is taken as the base text), and (3) a generated apparatus.
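For readers who want to see the idea in code, here is a minimal sketch of how a base text and an apparatus could be derived from an alignment table. It is not the code behind the sample output and does not use the CollateX API. The table layout (one list of tokens per witness, with `None` for a gap), the sigla `A`, `B`, `C`, and the function names `base_text` and `apparatus` are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical alignment table: one row per witness, one column per aligned
# position. None marks a gap (the witness has no token at that position).
table: Dict[str, List[Optional[str]]] = {
    "A": ["the", "quick", "fox", None],
    "B": ["the", "fast", "fox", "jumps"],
    "C": ["the", "quick", None, "jumps"],
}

# User-specified witness order; the first witness is taken as the base text.
order = ["A", "B", "C"]


def base_text(table: Dict[str, List[Optional[str]]], order: List[str]) -> str:
    """Join the non-gap tokens of the first witness in the given order."""
    return " ".join(tok for tok in table[order[0]] if tok is not None)


def apparatus(
    table: Dict[str, List[Optional[str]]], order: List[str]
) -> List[Tuple[int, Optional[str], Dict[Optional[str], List[str]]]]:
    """List every position where some witness differs from the base.

    Each entry is (column index, base reading, {variant reading: [witnesses]}).
    A reading of None means the witness omits the token at that position.
    """
    base, others = order[0], order[1:]
    entries = []
    for i, lemma in enumerate(table[base]):
        variants: Dict[Optional[str], List[str]] = {}
        for witness in others:
            reading = table[witness][i]
            if reading != lemma:
                variants.setdefault(reading, []).append(witness)
        if variants:
            entries.append((i, lemma, variants))
    return entries


if __name__ == "__main__":
    print(base_text(table, order))
    for entry in apparatus(table, order):
        print(entry)
```

Changing `order` changes which witness supplies the base text, and with it which readings end up as variants in the apparatus.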

CollateX is not perfect. Some of the output problems are the result of tokenizing (the samples used were tokenized very coarsely) and can be fixed. Problems with abbreviations and with prepositions that are sometimes written connected and sometimes unconnected (של, also words such as כיצד) can also be fixed. But some errors have to do with how CollateX deals with edit distance. Not sure how we are going to handle this.
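To illustrate why edit distance can be tricky for short words, here is a small sketch of the standard Levenshtein distance with a simple threshold test. This is the textbook algorithm, not necessarily what CollateX does internally. The function names `edit_distance` and `near_match` and the threshold of 1 are assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or keep it
            ))
        prev = curr
    return prev[-1]


def near_match(a: str, b: str, max_distance: int = 1) -> bool:
    """Hypothetical threshold test: treat two tokens as the same reading
    when their edit distance is at most max_distance."""
    return edit_distance(a, b) <= max_distance


if __name__ == "__main__":
    # Two different prepositions that differ by a single letter.
    print(edit_distance("של", "על"), near_match("של", "על"))
    # A spelling variant of the same word, also one letter apart.
    print(edit_distance("color", "colour"), near_match("color", "colour"))
```

With a fixed threshold, two different two-letter prepositions such as של and על are just as "close" as a spelling variant of a longer word, so a fuzzy matcher can align tokens that a philologist would keep apart.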

