Wow, there are ~996461580000 rows, so about 996 billion rows, close to 1 trillion!
So I am trying to figure out an economical way to process this many rows, specifically when constructing the main table of the ontology db.
Hmm, today's tasks:
- Hmm, start another test on this data to measure how many CPU hours a given row count takes, to have estimates.
Just writing them to cloud storage took nearly 4 hours. Reading them partition by partition, extracting the ontology info and storing it back to another cloud table might take at least twice as many hours. Hmm, I am not spawning a huge cluster currently: at most 3 task nodes instead of the initial 30 or 50 node cluster. I am going more iteratively with a 3 node cluster, since I have time for this table task anyway, given the other tasks that need to be accomplished in the meantime. So first I will check how many hours processing some partitions takes (the data is already partitioned) and then decide on the cluster size to scale out to. Nearly a trillion rows is not a small amount of data when trying not to burn lots of CPU hours.
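A minimal sketch of the estimate described above, assuming throughput scales linearly with row count and node count (real clusters only approximate this). The function names and the sample numbers are made up for illustration; they are not measurements from the test run.

```python
def estimate_node_hours(total_rows, sample_rows, sample_hours, sample_nodes):
    """Extrapolate linearly from one timed test run to the full table.

    Assumption: rows processed per node-hour stays constant at any scale.
    """
    if sample_rows <= 0 or sample_hours <= 0 or sample_nodes <= 0:
        raise ValueError("sample_rows, sample_hours and sample_nodes must be positive")
    rows_per_node_hour = sample_rows / (sample_hours * sample_nodes)
    return total_rows / rows_per_node_hour


def wall_clock_hours(node_hours, nodes):
    """Wall-clock time if the same work is spread evenly over `nodes` nodes."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return node_hours / nodes


TOTAL_ROWS = 996_461_580_000  # row count from the notes above

# Hypothetical test run: replace these with the measured values.
SAMPLE_ROWS = 10_000_000_000
SAMPLE_HOURS = 2.0
SAMPLE_NODES = 3

node_hours = estimate_node_hours(TOTAL_ROWS, SAMPLE_ROWS, SAMPLE_HOURS, SAMPLE_NODES)
for nodes in (3, 30, 50):
    hours = wall_clock_hours(node_hours, nodes)
    print(f"{nodes:>2} nodes: about {hours:,.1f} wall-clock hours")
```

Total node-hours (and so the bill) stay the same under this linear assumption; a bigger cluster only changes how long the wait is.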
Hmm, so today's task:
- Spawn some partitions to investigate how many partitions 3 task nodes can consume at once, and how long it takes to process them and create the initial table with subject, predicate, object and partition id. Config tests like that.
-- Hmm, plus one of the following activities:
- I already started to study Neo4j, which seems like an amazing graph db for this type of ontology storage too.
But of course I would limit the stored ontology dbs by ontological layer, since 1 trillion rows is too much to store as a single Neo4j db.
Hmm, the other task to do is:
check the Python dependency parser, this time by spawning a Jupyter notebook in AWS to run that dependency graph extractor, and also try out the spaCy tool. Hmm.
Then the other thing to continue is studying topology, to define the RDF of the topology vocabulary that verbs would be translated into, and to start checking possible topological definitions of some verbs in such a grammar.
So today would be interesting. Hmm, I would first try a Jupyter notebook with some ML code, which I have not tried for a long time. I forgot the pandas / NumPy libs a lot. I was doing mostly data engineering and not pandas/NumPy based machine learning engineering, so I have forgotten quite a bit, but I will revise.
But first, the most important thing is the in-tandem work of creating the ttl db of 1 trillion records in a partitioned Spark table. Later I would do other runs to create partitioned Neo4j dbs, I mean partitioned by topological boundaries. Those initial dbs will be very important.
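A rough sketch of what one row of that initial subject / predicate / object / partition id table could look like, in plain Python rather than Spark so it can run anywhere. It assumes simple N-Triples style input lines, and it assumes the partition id is a stable hash of the subject; both are placeholders, since the notes do not say how the real data is laid out or partitioned. `TripleRow` and `parse_triple_line` are hypothetical names.

```python
import zlib
from typing import NamedTuple, Optional


class TripleRow(NamedTuple):
    subject: str
    predicate: str
    obj: str            # the triple's object (named obj to avoid the builtin)
    partition_id: int


def parse_triple_line(line: str, num_partitions: int) -> Optional[TripleRow]:
    """Parse one '<s> <p> <o> .' style line into a table row.

    Returns None for blank lines, comments and lines without three terms.
    Assumption: terms are whitespace separated and only the object may
    contain spaces (e.g. a quoted literal).
    """
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    if text.endswith("."):
        text = text[:-1].rstrip()  # drop the statement terminator
    parts = text.split(None, 2)    # subject, predicate, rest of line as object
    if len(parts) != 3:
        return None
    subject, predicate, obj = parts
    # crc32 is stable across runs and machines, unlike Python's built-in hash().
    partition_id = zlib.crc32(subject.encode("utf-8")) % num_partitions
    return TripleRow(subject, predicate, obj, partition_id)


sample = '<http://example.org/a> <http://example.org/p> "hello world" .'
print(parse_triple_line(sample, num_partitions=8))
```

Hashing on the subject keeps all triples of one subject in the same partition. Partitioning by topological boundaries, as planned for the Neo4j dbs, would need a different key.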
Ayyy, I am so impatient to have the very first version of the AI ready. But I would need to translate a huge verb set into its topological RDF definitions, which would also be interesting.
Hmm, since I am not a linguist, a neural network expert or an NLP expert, I might not design the best algorithms/methods, but I think these algorithms, even if they are not the most efficient or efficient at all, would hopefully work as expected.
Yayyyy, such an exciting project.
Let's see: as its design evolves, is June 1st as the release of the first version of the AI still legit, since the design has changed? :D But I think it still might be viable to release the very first version on June 1. (I mean the 0.1 version; the 1st version is expected to be ready by November 1 at the latest.)
(It would be an AI able to do generative science and to read/understand science books, a type of AI project to quickly build the particle physics lab. There I think I would also design a quantum computer, and a medical simulator on that quantum computer, to invent some nanotech to heal illnesses. That is already done by current medicine, but I wish to devise my own AI doctor, since it sounds nice to solve illnesses completely with nanorobots. I mean one particle physics field that the AI controls and another particle physics field that the AI can scan with, then move the nanorobots with fields and heal whichever illness with the nanorobot-like tech I wish to build, e.g. to heal illnesses like the problem in my veins, or dad's cancer, cancer in general, etc.
Actually the goal is healing all illnesses: a medical simulator (molecular simulator) running on the invented quantum computer to devise medicine, plus some generative science to devise AI based control of nanorobots to heal illnesses inside the body, e.g. veins. We also need to build a new tomography device that would be non-invasive and would be invented from the particle physics lab's studies, yepp. So the goal is to invent some Star Trek universe like tech later. Actually not much later: I would iterate quickly on these projects one by one after the AI's 1st version is ready by Nov 1. Yupp, then the following goal would be the design of quantum algebra and the design and building of a quantum computer, to write medical simulations that task nanorobots with healing any wound. E.g. a vein is wounded? An organ is wounded? DNA/RNA is wounded?
Hmm, the goal is to steer nanorobots via some field (a particle physics field) that the AI would control, to do molecular engineering based targeted healing of the wound. We also need scanner tech with a resolution well below a nanometer to scan DNA/RNA and solve cancer: e.g. when a cell is over-mutating, the scanner would detect its position and nanorobots would heal the DNA/RNA problem of the cell so that it won't endlessly do mitosis. Things like that. All of these would be built very quickly, one by one, after the AI's first version is ready.
But first: quantum algebra design using the generative science capability of the AI, then quantum computer design, and quantum computer implementation methods to build it and install/run software on it, to then load particle physics/molecular engineering information into the computer as a simulator. In tandem, at this point the particle physics lab would be built, to investigate particle physics theory experimentally.)
These all might take some time of course. But let's give Nov 1 for the first AI version. Then the quantum computer design and the design of its algebra within, let's say, 5 months, since I would also interpret/translate the science terminology there for the AI, to do generative science with the AI's help. (Actually the AI would try to invent new algebras and new groups to compute more effectively, also solve quantum algebra's pitfalls, and design one efficient, easy to build type of quantum computer, then employ it as a particle physics/molecular engineering simulator with the currently hypothesized theories of those fields, to invent/test the mentioned nanobot tech to heal any wound/any illness.) This needs to be built urgently. This is the most important, most urgent project.
Spaceship flight technology would be invented later, or another AI version might investigate it at the same time. Hmm, yepp, why not iterate on these at the same time, since when the quantum computer is ready we can build another quantum computer for that too. Hmm, and keep an AI investigating quantum gravity or cosmology like topics, to try to figure out technology that could use the force of gravity to fly spaceships somehow. Maybe it might be possible. Or space-time congruities or incongruities, or altering space-time fields to move a spaceship somehow; it is unknown whether such things could be invented. The AI on the other quantum computer could investigate those in tandem while the first AI is focused on devising the medical tech defined above. Yeppp.
So after November 1, the first tasks would be about quantum algebra/particle physics and designing and building a quantum computer. Then the AI would be duplicated: one instance to investigate the medical tech goals, and another instance for the gravity/space-time related spaceship tech (hypothetical; it is not known whether such technologies could ever be invented). So I would duplicate the AI into two AIs to focus on these two tasks, after, hmm, the 4th month of 2024 I guess, since by then I guess the quantum computer design/simulators would be readied by the AI.
Hmm, so it's a very fast-track project; it's very urgently needed. (E.g. since I have an illness in my digestive system, the medical technology side of the project is specifically very urgent for me.)
Yayyyyyyy, so it's such an exciting project :) with such exciting goals, like devices/tech that would be like leaping to another century (or many decades, rather than a century): a technological leap via AI, I guess :)
Very exciting imho :) to be on the verge of such technological iterations.
It's not only my project. There are lots of people developing AI projects, so I guess multiple people would invent all these in 2024 or 2025. I mean, these technological innovations are expected to come not only from my project group but also from multiple other groups. Then it would be like an exponential advance of technological capabilities via AI in a short period. I believe that in 2 or 3 years the technology level would get near to the Star Trek universe. (Via AI; without AI it won't be possible to iterate that fast. But of course the AI wouldn't be able to invent anything if people had not already crafted nice cosmology/particle physics theories as the baseline the AI would start with. I mean, without the baseline of people's innovation the AI wouldn't be able to accomplish anything. But with that baseline it would accomplish very many things in a much shorter time, and the technology level would advance to that of the Star Trek universe, all through the AI's capabilities.)
I currently work alone, so my AI group is currently one person. In the future, would I accept other engineers into my AI project? I don't know if that's necessary either, since after the first version no extra effort/help from any engineer would be necessary imho. I think my AI study group would never need to expand. It is faster to build the AI this way, since there are no meetings needed to make design decisions and I just design quickly. If there were 2 engineers working, we would have to discuss and decide on technical design issues together, and that would be lots of meetings and lots of time. Nah. I think it's better to keep working as a single person, both when building this AI and when training the AI with particle physics, quantum mechanics information, etc.
All that matters currently is time. I mean, the faster this project is built, the better. This project needs to be built very urgently, since I need the doctor AI to heal the illness in my veins, whichever it is (I don't know what illness it is), and in my digestive system. This AI project's schedule is like a race against death. I need to build this AI urgently because I urgently need the doctor AI to help me. Yupp. (I am an extremely phobic person with a phobia of closed spaces, so I have issues with entering current medical devices like tomography scanners. That's why the doctor AI is most urgently needed, because I have a serious sickness in my digestive system plus, visibly, my veins. So it's very, very urgent to build this AI project.)
