Just make it an LJM (Large JSON Model) capable of predicting the next JSON token from the previous JSON tokens, and you would have massive savings in file storage and network traffic from not having to store and transmit full JSON documents, all in exchange for an "acceptable" error rate.
You need to make sure to remove excess whitespace from the JSON to speed up parsing. Have an AI read the JSON as plaintext and convert it to a handwriting-style image, then have another one use OCR to convert it back to text. Trailing whitespace will be removed.
My thoughts on software in general over the past 20 years. So many programs are written inefficiently, and in 4th-level languages, which just eats up any CPU/memory gain. (Less of a soapbox and more of a curious "what if" about how fast things would be if we still wrote highly optimized programs.)
They get to this result on 0.6 MB of data (paper, page 5)
They even say:
Moreover, there is no need to evaluate our design with datasets larger than the ones we have used; we achieve steady state performance with our datasets
This requires an explanation. I do see the need: if you promise 100 Gbps, you need to process at least a few Tb.
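To put numbers on that, here is a quick back-of-the-envelope sketch. It assumes "0.6 MB" means decimal megabytes and that the claimed rate is a flat 100 Gbps; the function name is just mine for illustration, not anything from the paper.

```python
def transfer_time_seconds(size_bytes: float, rate_bits_per_s: float) -> float:
    """Time to push size_bytes through a link running at rate_bits_per_s."""
    return size_bytes * 8 / rate_bits_per_s

DATASET_BYTES = 0.6e6  # 0.6 MB, assuming decimal megabytes
LINE_RATE = 100e9      # 100 Gbps, the advertised throughput

t = transfer_time_seconds(DATASET_BYTES, LINE_RATE)
print(f"{t * 1e6:.0f} microseconds")  # 48 microseconds

# For comparison: one terabit at the same rate.
one_terabit_seconds = 1e12 / LINE_RATE
print(f"{one_terabit_seconds:.0f} seconds")  # 10 seconds
```

So at the advertised rate the whole dataset is gone in tens of microseconds, while a few Tb would mean tens of seconds of sustained load.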
Personally, now that I have a machine capable of running the toolchains, I want to explore hardware-accelerated compilation. Not every step can be done in parallel, but I bet a lot of the work before linking can.
I have the same problem with XML too. Notepad++ has a plugin that can format a 50MB XML file in a few seconds. But my current client won't allow plugins to be installed. So I have to use VS Code, which chokes on anything bigger than what I could do myself manually if I was determined.
If you're not aware, it was called MB because of JEDEC, back before IEC units were invented. IEC units were introduced because they remove the double meaning of JEDEC units (decimal and binary). IEC units only carry the binary meaning, which is why they're superior. If you convert 1000 kB to 1 MB, then use MB, but if you convert 1024 KiB to 1 MiB, you should be using MiB. It's all about getting the point across, and JEDEC units aren't good at it.
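For anyone who wants to see the difference in actual bytes, a minimal sketch (the constant and function names are just illustrative):

```python
# Decimal (SI) multiples
KB = 1000
MB = 1000 ** 2

# Binary (IEC) multiples
KIB = 1024
MIB = 1024 ** 2

def to_mb(n_bytes: float) -> float:
    """Bytes to decimal megabytes."""
    return n_bytes / MB

def to_mib(n_bytes: float) -> float:
    """Bytes to binary mebibytes."""
    return n_bytes / MIB

print(to_mb(1000 * KB))   # 1.0
print(to_mib(1024 * KIB)) # 1.0
print(MIB - MB)           # 48576 bytes of difference per "mega" unit
```

The gap is just under 5% at this scale and keeps growing with each larger prefix, which is why it matters which unit you write down.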
That's not how this works. GPUs are fast because the kind of work they do is embarrassingly parallel and they have hundreds of cores. Loading a JSON file is not something that can be trivially parallelized. Also, Zed uses the GPU for rendering, not for reading files.
I'd like to point out, for those who aren't in the weeds of silicon architecture, that "embarrassingly parallel" is a type of computational workload. It's just named that because the solution is an embarrassingly easy one: the work splits into independent pieces that don't need to talk to each other.
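A rough sketch of the contrast, in case it helps. Both functions are toy examples I made up for illustration. The first is embarrassingly parallel because every item is independent; the second shows the kind of sequential dependency that makes JSON awkward to split up. Threads are used only to show the shape of the work (in CPython they won't actually speed up CPU-bound code), and the bracket counter deliberately ignores brackets inside string literals.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n: int) -> int:
    # Depends only on its own input: no shared state, no required order.
    return n * n

def parallel_squares(values, workers: int = 4):
    # Any worker can take any item; results are collected in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))

def nesting_depths(text: str):
    # The depth at each character depends on everything before it, so a
    # worker starting in the middle of the file cannot know its depth.
    # Simplification: brackets inside string literals are not handled.
    depth = 0
    depths = []
    for ch in text:
        if ch in "[{":
            depth += 1
        elif ch in "]}":
            depth -= 1
        depths.append(depth)
    return depths

print(parallel_squares([1, 2, 3]))  # [1, 4, 9]
print(nesting_depths('{"a":[1]}'))  # [1, 1, 1, 1, 1, 2, 2, 1, 0]
```

Parallel JSON parsers do exist, but they need extra passes to work out this structural state first, which is the "not trivially" part.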
As far as my understanding goes, Zed uses the GPU only for rendering things on screen. And from what I've heard, most editors do that. I don't understand why Zed uses that as a key marketing point.