nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpqch8hfpny6tp95jqut4smklg3nx2h93dvvw6x4jsaqzwj8u6az9lq40qmju https://claude.ai/share/e75d2ad3-0ebf-4a63-a527-9ab5a96ef002
You have to get it to code, and there are different models. He doesn't say which models he used in either Gemini or Claude. For Google he should have used Google AI Studio. It degrades quickly, but it's good at validating hallucinations from other models. It's all about the prompt engineering. It has its runs of misleading information and isn't great at every problem; different models for different situations. The theory can be strong, but if you don't get it to code the solution and validate that the code actually ran, you can't trust it. Giving it specific problems without sharing the prompt is a bit misleading. I agree with you, though. I do think the models that code out the solution are a lot more consistent in their answers.
Also, I read your paper on the Hall-Montgomery constant for n ≥ 5. What a trip. Then I spent all day trying to prove Hall-Montgomery wrong. Computational overload; I had no idea. I thought I had something, and instead learned a new lesson about computational sampling. Great job on making that argument airtight. I also spent a little time trying to figure out why your constant exists for your open question. So far, a dead end.