===============
Spanish WordNet
===============
Tools for converting the Spanish translation of WordNet to the file formats
used for the English version.
Why:
Because this allows developers and researchers to use the WordNet database with
the tools available for different platforms and programming languages.
How:
By converting the XML files of the Spanish translation of WordNet into the
common file formats.
Note:
This is a work in progress.
Last update:
2012-01-10
Ideas
=====
1. Download the English WordNet and edit it with extJWNL, replacing the words
with the translations.
- Start from the attribute wn_synset.spa for renaming the lemma.
http://www.zentus.com/sqlitejdbc/
- Translate the gloss word by word:
wn_gloss
wn_trad
wn_sk (translation for verbs, nouns, adj)
wn_variants
Notes
-----
- I wrote a script, `tools/dbimport.py`, that creates a database from the XML
files with the Spanish translations. The database contains one table per file,
with one row for each 'row' element and one column for each attribute of that
element.
The script is written in Python and uses SQLAlchemy as the ORM.
- Several tables refer to WN elements by their sense keys. Sense keys are
described in the WN documentation:
http://wordnet.princeton.edu/man/senseidx.5WN.html
NLTK supports looking up a lemma from the sense key using the method:
wordnet.lemma_from_key('.22-caliber%3:01:00::')
Here is the documentation of the NLTK WN module:
http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html
- extJWNL, the Java library for WN, provides a command-line tool that can run a
script of modifications to the database:
usage edit: ewn -script filename
filename contains edit commands as above, one sensekey per line.
For example:
goal%1:09:00:: -add -addword end -setgloss "the state ... achieve it; ""the ends justify the means"""
n#oxen -addexc ox
- The `Dictionary` class from extJWNL provides methods to get a `Word` object by
its sense key (`getWordBySenseKey`) and to edit the dictionary (`edit`, `save`,
`close`). These methods could be used to update the synset glosses with the
Spanish translations.
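
As a rough illustration of the first two notes, the Python sketch below loads 'row' elements into a table (one column per attribute) and then splits the stored sense keys into their fields. It is not the code in `tools/dbimport.py`: it uses the standard library modules `sqlite3` and `xml.etree.ElementTree` instead of SQLAlchemy, and the sample XML, the table name `wn_sample`, the attribute names `sk` and `word`, and both function names are made up for the example. The sense key layout (`lemma%ss_type:lex_filenum:lex_id:head_word:head_id`) follows the WN senseidx documentation linked above.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up sample data; the real files and attribute names may differ.
SAMPLE_XML = """<table>
  <row sk="goal%1:09:00::" word="ejemplo 1"/>
  <row sk=".22-caliber%3:01:00::" word="ejemplo 2"/>
</table>"""

# ss_type codes as listed in the WN senseidx documentation.
SS_TYPES = {1: "noun", 2: "verb", 3: "adjective", 4: "adverb",
            5: "adjective satellite"}


def import_rows(conn, table, xml_text):
    """Create `table` and insert one record per 'row' element.

    The attributes of the 'row' elements become the table columns.
    Returns the number of inserted records.
    """
    rows = [dict(el.attrib) for el in ET.fromstring(xml_text).iter("row")]
    # Collect the attribute names in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    if not columns:
        return 0
    quoted = ", ".join('"%s"' % c for c in columns)
    conn.execute('CREATE TABLE "%s" (%s)' % (table, quoted))
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        'INSERT INTO "%s" VALUES (%s)' % (table, placeholders),
        [[row.get(c) for c in columns] for row in rows],
    )
    return len(rows)


def split_sense_key(key):
    """Split a sense key into its documented fields."""
    lemma, _, lex_sense = key.partition("%")
    ss_type, lex_filenum, lex_id, head_word, head_id = lex_sense.split(":")
    return {
        "lemma": lemma,
        "ss_type": SS_TYPES[int(ss_type)],
        "lex_filenum": int(lex_filenum),
        "lex_id": int(lex_id),
        "head_word": head_word,
        "head_id": head_id,
    }


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    import_rows(conn, "wn_sample", SAMPLE_XML)
    for (sk,) in conn.execute("SELECT sk FROM wn_sample"):
        print(split_sense_key(sk))
```

Storing every attribute as an untyped column keeps the import generic, at the cost of leaving any type conversion to the code that reads the tables.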
References
==========
extJWNL
http://extjwnl.sourceforge.net/
Spanish translation of WordNet
http://grial.uab.es/tools/download/
License and Authors
===================
See LICENSE.txt and AUTHORS.txt.
Contact
=======
Juan Manuel Caicedo
http://cavorite.com
juan@cavorite.com