Hmm, it’s more subtle than that. Let me try to explain.
First of all, you need to know that Cloud Run is serverless: you pay only while a request is being processed. When no request is being processed, you pay nothing. In return, Google throttles the CPU of instances that are not processing a request, so that it can be used by other services deployed on Cloud Run.
Now, take your use case. A request comes in, an instance is started, and an answer (a 404) is returned, let’s say after 1 second. Once the answer is returned, no request is being processed anymore, so the CPU is throttled. From then on, your instance is allowed only a small share of the CPU (about 5%), and therefore it takes a long, long time to finish the server startup.
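To get a feel for the numbers, here is a rough back-of-the-envelope sketch. The function name and the 10 CPU-seconds of startup work are hypothetical, chosen only for illustration; the 5% share is the approximate figure mentioned above, not an exact or guaranteed value.

```python
def throttled_wall_time(cpu_seconds_needed: float,
                        full_speed_seconds: float,
                        throttled_share: float = 0.05) -> float:
    """Estimate the wall-clock time needed to finish the startup work.

    cpu_seconds_needed: total CPU time the startup requires (hypothetical).
    full_speed_seconds: time at full CPU, i.e. while the first request
        is still being processed.
    throttled_share: fraction of the CPU left after throttling
        (about 5%, an approximation).
    """
    # Work completed while the request is in flight runs at full speed.
    done_at_full_speed = min(cpu_seconds_needed, full_speed_seconds)
    # Whatever is left runs on the throttled CPU share.
    remaining = cpu_seconds_needed - done_at_full_speed
    return done_at_full_speed + remaining / throttled_share


# Hypothetical case: 10 CPU-seconds of startup, answer returned after 1 second.
print(round(throttled_wall_time(10.0, 1.0)))
```

With these assumed figures, a startup that would take 10 seconds on an unthrottled CPU stretches to about 181 seconds, because the 9 remaining seconds of work run at roughly one twentieth of the speed.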
In addition, don’t forget that your workstation has 4, 8 or more CPUs, while Cloud Run has only 1 (by default). Try increasing the number of CPUs to speed up the process. Your CPU also runs at 3.5 or 4 GHz on your workstation, and only at 2.6 to 2.8 GHz in the cloud. Finally, your local CPU can keep data from a previous execution in cache (L2 or L3) and run the next one faster. In the cloud, the instance is new and the whole context needs to be loaded, with no prefetch.
All these differences need to be taken into account when you compare your local execution with the execution on Cloud Run.