Interesting, I’ve never needed 1M, or even 250k+ context. I’m usually under 100k per request.
About 80% of my code is AI-generated, with a controlled workflow using dev-chat.md and spec.md. I use Flash for code maps and auto-context, and GPT-4.5 or Opus for coding, all via API with a custom tool.
Gemini Pro and Flash have had 1M context for a long time, but even though I use Flash 3 a lot, and it’s awesome, I’ve never needed more than 200k.
For production coding, I use:
- A code-map strategy on a big repo. For each file, extract a summary, when_to_use, public_types, and public_functions, and cache the result until the file changes. With a concurrency of 32, I can usually code-map a huge repo in minutes. (Typically Flash: cheap, fast, and very good results.)
- Then, auto-context based on code lensing: auto-context takes globs that narrow what the AI can see, then uses the intersection with the code map to ask the AI which files belong in context. (Typically Flash: cheap, relatively fast, and very good.)
- Then, a bigger model, GPT 5.4 or Opus 4.6, does the actual work. At this point, context is typically between 30k and 80k tokens at most.
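A minimal sketch of the caching half of step one, assuming a content-hash-keyed cache and a thread pool with 32 workers as described; `build_code_map` is a stand-in for the cheap-model call, and all names here are illustrative, not the real tool's API:

```python
# Hypothetical sketch of the per-file code-map cache; not a real tool's API.
import concurrent.futures
import hashlib
import pathlib


def file_digest(path: pathlib.Path) -> str:
    """Content hash used to invalidate a file's cached map when it changes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_code_map(path: pathlib.Path) -> dict:
    """Stand-in for the model call that summarizes one file.
    In the real workflow this would prompt a cheap model (e.g. Flash)."""
    return {
        "summary": f"summary of {path.name}",
        "when_to_use": "...",
        "public_types": [],
        "public_functions": [],
    }


def map_repo(files, cache: dict) -> dict:
    """Re-map only files whose content hash is missing from the cache,
    fanning the model calls out across a 32-worker thread pool."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
        pending = {}
        for f in files:
            key = file_digest(f)
            if key not in cache:
                pending[pool.submit(build_code_map, f)] = key
        for fut, key in pending.items():
            cache[key] = fut.result()
    return cache
```

Keying the cache on a content hash rather than mtime means a `git checkout` that restores an old file body re-hits the old cached map for free.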
What I’ve found is that this process is surprisingly effective at getting a high-quality response in one shot. It keeps everything focused on what’s needed for the job.
Higher precision on the input typically leads to higher precision on the output. That’s still true with AI.
For context, 75% of my code is Rust, and the other 25% is TS/CSS for web UI.
Anyway, it’s always interesting to learn about different approaches. I’d love to understand the use case where 1M context is really useful.
The parent is right to stress the importance of the review phase, which acts as a filter deciding whether we ever see a candidate paper at all. But it is indeed the case, as the OP posits, that there is merit in extracting the context in which a citation is used; this has been termed "citation polarity".
For example, 'Chomsky (1969) was entirely wrong when he said that “it must be recognized that the notion of 'probability of a sentence' is an entirely useless one, under any known interpretation of this term".'
There are Natural Language Processing methods and tools to extract citation polarity. Researchers like S. Teufel have used citation polarity and other signals to analyze publications ("rhetorical zoning") and to create maps of the published literature: https://www.cl.cam.ac.uk/~sht25/az.html
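As a toy illustration only (real polarity classifiers such as Teufel's are trained on annotated corpora, not keyword lists), even crude cue-word matching separates the Chomsky example above from a neutral citation:

```python
# Toy cue-based citation polarity classifier; cue lists are made up
# for illustration and far too small for real use.
NEGATIVE_CUES = {"wrong", "fails", "flawed", "overlooks", "useless"}
POSITIVE_CUES = {"seminal", "elegant", "successfully", "confirms"}


def citation_polarity(context: str) -> str:
    """Classify a citation's surrounding sentence by counting cue words."""
    words = [w.strip('.,;:"\'()') for w in context.lower().split()]
    neg = sum(w in NEGATIVE_CUES for w in words)
    pos = sum(w in POSITIVE_CUES for w in words)
    if neg > pos:
        return "negative"
    if pos > neg:
        return "positive"
    return "neutral"
```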
A scientific literature search engine (Semantic Scholar, CiteSeerX, etc.; sadly, Microsoft Academic was recently discontinued) can benefit from such knowledge. For example, PageRank can be modified to incorporate citation polarity into the random-walker model that underlies it. Think of it as adding a "dislike" button to a system whose graph already encodes "like" relations.
I like the headline. Nobody really knows how the economy works. Economics is an attempt to explain how wealth and prosperity are created. That also means some branches are biased towards what we already have. It gets especially bad when morality is invoked, because from the perspective of something as uncaring as an economy there are no meaningful morals; they are all made up. I personally think the vast majority of economic problems are caused by modeling errors. The model we use to look at the real economy is too rigid to represent it.
The laziest way to explain the rigidities is to just blame someone else, most of the time governments. However, getting rid of governments doesn't seem to get rid of all rigidity because, as it turns out, our physical world is constrained by more than just politics. What's often forgotten is that governments can also fight against natural forces that reduce flexibility in the economy. Therefore the answer is neither more government nor less government. No, it's the usual boring answer: what we need is better government. It's like a supply-and-demand situation. There is demand for a certain size of government, and the supply is often either too high or too low. The goal is to find the balance.