MCP (Model Context Protocol) is basically an API for LLM agents: it lets them access resources in real time. Those resources can change over time, and with MCP you can even let the LLM make those changes itself in a structured, constrained way.
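To make that concrete, here is a rough sketch of the idea in plain Python. It does not use the real MCP SDK or wire protocol. `ToolServer`, `get_stock`, `reserve`, and the `inventory` dict are all made-up names for illustration. The point is only that the agent reads live state through one tool and can change it only through another tool that enforces limits.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToolServer:
    """Toy stand-in for an MCP-style server (hypothetical, not the real SDK)."""
    tools: Dict[str, Callable[..., object]] = field(default_factory=dict)

    def tool(self, fn):
        # Register a function under its own name so an agent can call it.
        self.tools[fn.__name__] = fn
        return fn

    def call(self, name, **kwargs):
        # The agent can only invoke tools that were explicitly registered.
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](**kwargs)


# Live state that keeps changing after the model was trained.
inventory = {"widget": 5}

server = ToolServer()


@server.tool
def get_stock(item: str) -> int:
    """Read-only access: report the current stock level."""
    return inventory.get(item, 0)


@server.tool
def reserve(item: str, quantity: int) -> int:
    """Constrained write: reserve stock, but never below zero."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    if quantity > inventory.get(item, 0):
        raise ValueError("not enough stock")
    inventory[item] -= quantity
    return inventory[item]
```

Calling `server.call("reserve", item="widget", quantity=2)` changes the inventory, while a request for more than is in stock, or for a tool that was never registered, is rejected. The model never touches the data directly, only through what the server chooses to expose.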
Once an LLM is trained, its weights are frozen, so on its own it only knows what was in its training data. The other thing is that if you take an open-weight model and fine-tune or distill it, you can end up making it worse along some dimensions even if it gets better for your specific case. See this article on catastrophic forgetting:
https://cobusgreyling.medium.com/catastrophic-forgetting-in-llms-bf345760e6e2