I think I would also need to include the Wikidata ontology db's ontology definition set beside WordNet, since on the nouns side WordNet's ontology is not always accurate.
Hmm, I think WordNet might be a nice starting point for the set of verbs to translate (verbs with no hypernyms, i.e. root verbs).
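Something like this minimal sketch, assuming NLTK with the WordNet corpus already downloaded, is what I mean by pulling out the root verbs:

```python
# Minimal sketch, assuming nltk with the WordNet corpus downloaded
# (nltk.download('wordnet')): list verb synsets with no hypernyms, i.e. root verbs.
from nltk.corpus import wordnet as wn

root_verbs = [s for s in wn.all_synsets(pos=wn.VERB) if not s.hypernyms()]
print(len(root_verbs))                       # count of root verb synsets
print([s.name() for s in root_verbs[:10]])   # first few root verb synset names
```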
But for the noun ontology side, I think I would also look to the Wikidata ontology db for that (for the homomorphism constructions).
I think I would mainly use WordNet initially for the verb POS type, and for nouns I would try to create ontology definition sets from the Wikidata ontology db.
But the latter requires some preparation:
e.g. I previously created subject/predicate/object tables for storing the Wikidata ontology db. I then need to process that data to reduce it to a single language and preprocess it, e.g. to create layered ontology tables and dictionary dbs that map a noun to its hyponyms (the previous ontology layer), etc.
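Roughly what I have in mind for that preprocessing, as a minimal PySpark sketch (the bucket paths, column names and predicate IRIs here are assumptions, not the actual schema of my tables):

```python
# Minimal sketch, assuming hypothetical parquet tables with columns
# subject/predicate/object produced by the earlier Spark job.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("wikidata-ontology-layers").getOrCreate()

spo = spark.read.parquet("s3://my-bucket/wikidata/spo/")  # hypothetical path

# Keep only English labels to collapse the multilingual data to one language.
labels = (spo
          .filter(F.col("predicate") == "http://www.w3.org/2000/01/rdf-schema#label")
          .filter(F.col("object").endswith("@en")))

# "subclass of" (P279) edges give one ontology layer: class -> its subclasses.
subclass_of = spo.filter(
    F.col("predicate") == "http://www.wikidata.org/prop/direct/P279")

# One dictionary-like layer: each superclass with the list of its subclasses.
layer = (subclass_of
         .groupBy("object")
         .agg(F.collect_list("subject").alias("subclasses")))
layer.write.mode("overwrite").parquet("s3://my-bucket/wikidata/layer1/")
```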
Hmm, I think today I would rather focus on verb translations and on the conversion from dependency graph to knowledge graph (Neo4j), and on translating the set of verbs there into Sage-wise definition sets.
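For the dependency-graph-to-Neo4j part, a minimal sketch of what I mean, assuming spaCy's en_core_web_sm model and the neo4j Python driver (5.x) against a local instance; the Token/DEP labels are just illustrative choices:

```python
import spacy
from neo4j import GraphDatabase

nlp = spacy.load("en_core_web_sm")
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def sentence_to_graph(tx, sentence):
    doc = nlp(sentence)
    for token in doc:
        # One node per token, one edge per dependency arc (head -> child).
        tx.run(
            "MERGE (h:Token {text: $head, pos: $head_pos}) "
            "MERGE (c:Token {text: $child, pos: $child_pos}) "
            "MERGE (h)-[:DEP {label: $dep}]->(c)",
            head=token.head.text, head_pos=token.head.pos_,
            child=token.text, child_pos=token.pos_,
            dep=token.dep_)

with driver.session() as session:
    session.execute_write(sentence_to_graph, "The cat chased the mouse.")
driver.close()
```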
Hmm, then it seems I need to do some big-data processing on the Wikidata ontology db to create single-language, clear and precise ontology dbs/dictionaries, etc. Actually there is also a SPARQL-based HTTP service for ontological queries, but that too requires knowing the ontology structure up front, imho.
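For reference, querying that SPARQL HTTP service looks roughly like this (a minimal sketch assuming the SPARQLWrapper package and the public Wikidata Query Service endpoint); it illustrates the point: you already have to know that wdt:P279 means "subclass of" before you can ask anything useful:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="ontology-exploration-example/0.1")
sparql.setQuery("""
SELECT ?sub ?subLabel WHERE {
  ?sub wdt:P279 wd:Q729 .   # direct subclasses of animal (Q729)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 20
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["sub"]["value"], "-", row["subLabel"]["value"])
```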
Hmm, nice that that db is ready and only needs to be preprocessed. I don't remember whether the Spark task finished; it may have failed. But the data is at least consecutive, so the notebook could continue from the last subject/predicate/object statements it wrote, or maybe it has already processed all the data; I don't remember exactly. I would check by looking at the very last partition added to the S3 file system to see the last TTL entry that was processed, and then later run a similar task to add any missing TTL entries.
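Checking the last written partition could be a small boto3 script along these lines (bucket and prefix names are hypothetical):

```python
import boto3

s3 = boto3.client("s3")
bucket, prefix = "my-bucket", "wikidata/spo/"   # hypothetical names

# Find the most recently written object under the output prefix,
# i.e. the last partition the Spark job managed to write.
latest = None
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        if latest is None or obj["LastModified"] > latest["LastModified"]:
            latest = obj

if latest:
    print("last written partition:", latest["Key"], latest["LastModified"])
```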
As I remember it is petabytes of data, largely due to the multiple languages; it would shrink a lot when reduced to a single language.
Hmm, I am happy that I got those predicate/subject/object parquet tables built before :) (if incomplete, that task would restart from the last processed TTL entry). It really was a daunting data-engineering task: petabytes of data, and the TTL needs to be preprocessed via Jena-like frameworks.
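The per-chunk parsing step is conceptually just this (a minimal sketch using rdflib as a stand-in for the Jena-style tooling; it only makes sense per chunk, not for the whole dump in memory):

```python
from rdflib import Graph

g = Graph()
g.parse("chunk-000001.ttl", format="turtle")    # hypothetical input chunk

# Flatten the parsed graph into plain subject/predicate/object rows,
# which is the shape of the parquet tables mentioned above.
rows = [(str(s), str(p), str(o)) for s, p, o in g]
for subject, predicate, obj in rows[:5]:
    print(subject, predicate, obj)
```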
Hmm, so since WordNet's ontology is not well suited for the noun side, that task is postponed until later, when I will use the Wikidata ontology db.
Hmm, now then: the somewhat tedious task of converting the 557 or more root verbs (verbs with no hypernyms), or some subset of them, to their Sage-wise definitions.
Hmm, let's start these activities of translating verbs to homomorphism definitions in Sage-wise form, I mean textual, compilable Sage code stored as a knowledge graph in Neo4j nodes, together with contextual node information, etc.
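A minimal sketch of what storing one such verb node could look like, again assuming the neo4j Python driver against a local instance; the labels, property names and the tiny Sage-style snippet are placeholders, not the real translations:

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def store_verb_definition(tx, lemma, sage_code, context):
    # One node per root verb, with its compilable Sage-style definition
    # and contextual information kept as plain string properties.
    tx.run(
        "MERGE (v:RootVerb {lemma: $lemma}) "
        "SET v.sage_code = $sage_code, v.context = $context",
        lemma=lemma, sage_code=sage_code, context=context)

with driver.session() as session:
    session.execute_write(
        store_verb_definition,
        "move",
        "def move(entity, frm, to):\n    entity.location = to\n    return entity",
        "physical displacement sense")
driver.close()
```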
Hmm, let's do this sentences-of-a-paragraph to knowledge-graph conversion task first today.