TL;DR: Long polling is when a client requests something from the server and the data is not available yet. Instead of replying immediately, the server keeps the connection open. Once the requested data is available (or the configured maximum waiting time is over), it responds and closes the connection.
Now, let’s go a little deeper into the subject.
✋ Before Long Polling, Know About Polling
Let me start with the requirement. Let’s say I am building a Live Soccer Score Website. Users can see the latest score. On the client side, I will keep asking for updates on the scores at regular intervals, let’s say every 30 seconds. This is what polling is.
Polling, or to sound more technical, HTTP polling, is when you send an HTTP request to the server at regular intervals. Yes, it is as simple as that.
Another variation of this is short polling. In my understanding, this is just polling with small intervals between requests, for example, 2 seconds. I could not find any official difference between polling and short polling, so I will use the two terms interchangeably.
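Here is a minimal sketch of that loop in Python. The `poll` helper, the `fetch_score` callback and the opponent “ABC United” are placeholders I made up for illustration; in a real client `fetch_score` would perform the HTTP request, and `None` stands in for an empty response.

```python
import time
from typing import Callable, List, Optional


def poll(fetch_score: Callable[[], Optional[str]],
         interval_seconds: float,
         max_requests: int,
         sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Ask for the score every interval_seconds and collect non-empty answers."""
    updates = []
    for attempt in range(max_requests):
        score = fetch_score()            # one HTTP request in a real client
        if score is not None:            # None stands for "nothing new"
            updates.append(score)
        if attempt < max_requests - 1:
            sleep(interval_seconds)      # wait before asking again
    return updates


# Stand-in for the server: only the third request has something new.
responses = iter([None, None, "XYZ FC 1 - 0 ABC United", None])
updates = poll(lambda: next(responses), interval_seconds=30,
               max_requests=4, sleep=lambda _: None)
print(updates)
```

Four requests go out, and only one of them brings back anything useful.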
🔻 The Drawbacks of Short-Polling
The real bottleneck with polling is the constant stream of requests to the server even when there is no new data available. This increases network traffic and wastes bandwidth 😢. Not only that, every request that cannot reuse an existing connection (for example via HTTP keep-alive) also pays the overhead of opening a new TCP connection.
In my case for Soccer Score, I have thousands of users (yes, it’s a pretty popular website). We won’t be getting a goal every 30 seconds, so a lot of my requests will return an empty response, and to make it worse, this is happening for every one of those thousands of users.
Another drawback is not-so-live data, i.e. delayed updates. Because requests go out at regular intervals, an update can be delayed by up to one full interval. Short polling also puts load on the server due to frequent requests when there are large numbers of clients.
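To put rough numbers on it, here is a back-of-the-envelope calculation. The 5,000 users, the 90-minute match and the 3 goals are figures I picked purely for illustration, and I assume each user sees each goal in exactly one response.

```python
users = 5_000                  # assumed audience size
match_minutes = 90             # assumed match length
poll_interval_seconds = 30     # the interval from the example above
goals = 3                      # assumed number of score changes

polls_per_user = match_minutes * 60 // poll_interval_seconds
total_requests = users * polls_per_user
useful_responses = users * goals          # each user sees each goal once
empty_responses = total_requests - useful_responses

print(polls_per_user, total_requests, empty_responses)
print(f"{empty_responses / total_requests:.1%} of responses carry no news")
```

Under these assumptions that is 180 requests per user and 900,000 in total, of which 885,000 (98.3%) come back empty.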
⭐️ Introducing The Long Polling Request
I know the bottleneck of a short-polling design for my requirement, and now I want to resolve it. What if I tell the server, “Hey there, I am requesting the score for my team XYZ FC. But you don’t have to respond immediately. Give me the data once it is available. I will wait”?
This is what long polling is. In technical terms, in long polling the server keeps the connection alive and doesn’t respond until the requested data is available. It parks the request somewhere (the implementation differs for each technology). Once the data is available, it picks up the parked request and returns the response to the client. The maximum waiting time is configurable, for example 60 seconds or 2 minutes.
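To make the “parking” idea concrete, here is a small sketch using Python’s `asyncio`. It is one possible way to do it, not the way any particular framework does it: the `ScoreBoard` class, its method names and the version counter are my own assumptions, and the HTTP layer is left out entirely.

```python
import asyncio
from typing import Optional, Tuple


class ScoreBoard:
    """Holds the latest score and parks callers until it changes."""

    def __init__(self) -> None:
        self._score: Optional[str] = None
        self._version = 0
        self._changed = asyncio.Condition()

    async def publish(self, score: str) -> None:
        async with self._changed:
            self._score = score
            self._version += 1
            self._changed.notify_all()       # wake every parked request

    async def wait_for_update(self, seen_version: int,
                              timeout: float) -> Optional[Tuple[int, str]]:
        """Return (version, score) once newer data exists, or None on timeout."""
        async def parked():
            async with self._changed:
                await self._changed.wait_for(
                    lambda: self._version > seen_version)
                return self._version, self._score
        try:
            return await asyncio.wait_for(parked(), timeout)
        except asyncio.TimeoutError:
            return None                      # wait time over: empty response


async def demo():
    board = ScoreBoard()
    # The request is parked; a goal arrives shortly after and releases it.
    request = asyncio.create_task(board.wait_for_update(0, timeout=1.0))
    await asyncio.sleep(0.05)
    await board.publish("XYZ FC 1 - 0")
    answered = await request
    # Nothing new happens within the wait time, so the answer is empty.
    timed_out = await board.wait_for_update(1, timeout=0.05)
    return answered, timed_out


answered, timed_out = asyncio.run(demo())
print(answered, timed_out)
```

The first request returns as soon as the score is published, and the second one gives up after its wait time and returns `None`, which the client would treat as “no data, ask again”.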
💪 Advantages of Long Polling
The server doesn’t get bombarded with requests while the requested data is not available, which saves a lot of resources. Server-side failure handling is better, and there are fewer lookups on data sources.
With additional optimization, similar requests in the waiting queue (parked) can be grouped and processed at once, reducing the processing time.
As mentioned earlier, the waiting time on the server is not infinite; it is (and should be) configurable, for example up to 60 seconds. If the data is not available within 60 seconds of the request, just respond with no data, clear the parked connection, and let the client initiate a new request. This way, you are not clogging up the server with too many parked requests.
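A rough sketch of that grouping idea, again with `asyncio`: parked requests are bucketed by team, so a single score update answers every request waiting for that team. `ParkingLot`, `park` and `release` are names I invented for this example, and “ABC United” is a made-up second team.

```python
import asyncio
from collections import defaultdict


class ParkingLot:
    """Groups parked requests by team so one update answers all of them."""

    def __init__(self) -> None:
        self._parked = defaultdict(list)     # team -> list of waiting futures

    def park(self, team: str) -> asyncio.Future:
        future = asyncio.get_running_loop().create_future()
        self._parked[team].append(future)
        return future

    def release(self, team: str, score: str) -> int:
        """Answer every request parked for this team; return how many."""
        waiting = self._parked.pop(team, [])
        for future in waiting:
            if not future.done():            # skip requests already cancelled
                future.set_result(score)
        return len(waiting)


async def demo():
    lot = ParkingLot()
    xyz_fans = [lot.park("XYZ FC") for _ in range(3)]
    other_fan = lot.park("ABC United")
    released = lot.release("XYZ FC", "XYZ FC 1 - 0")
    scores = await asyncio.gather(*xyz_fans)
    return released, scores, not other_fan.done()


released, scores, other_still_parked = asyncio.run(demo())
print(released, scores, other_still_parked)
```

One call to `release` answers all three XYZ FC requests, while the request for the other team stays parked.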
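On the client side, the matching loop could look something like the sketch below. `request_update` is a hypothetical callback standing in for the blocking HTTP call; it returns `None` when the server gave up waiting. The point to notice is that there is no sleep between rounds: the client reconnects right away, whether it got data or a timeout.

```python
from typing import Callable, List, Optional


def long_poll(request_update: Callable[[float], Optional[str]],
              wait_seconds: float,
              max_rounds: int) -> List[str]:
    """Keep one request parked at the server and re-issue it after every reply."""
    updates = []
    for _ in range(max_rounds):
        # Blocks until the server has data or its wait time runs out.
        score = request_update(wait_seconds)
        if score is not None:
            updates.append(score)
        # No sleep here: start the next request immediately either way.
    return updates


# Stand-in for the server: a timeout, a goal, another timeout, another goal.
replies = iter([None, "XYZ FC 1 - 0", None, "XYZ FC 2 - 0"])
updates = long_poll(lambda wait: next(replies), wait_seconds=60, max_rounds=4)
print(updates)
```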
🔻 Cons of Long Polling
There are many challenges to doing long polling right on the server side. One of the key challenges is parking the connection. If each waiting request blocks a thread until the data arrives, the server quickly runs out of threads, so asynchronous programming is the preferred approach for long-polling applications: it can queue up the request and continue to serve other requests.
Another hurdle for long polling in browsers is the limit on active connections. Google Chrome, for example, allows only 6 parallel HTTP/1.1 connections per host.
The next minor drawback is that when the connection is broken, the server may still hold the parked request, wasting precious queue resources. Along with that, what if the user changes networks? How that should be handled is a separate topic of its own.
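To show why the async approach helps, this small sketch parks 1,000 pretend requests on a single thread. `parked_request` is a made-up stand-in, and `asyncio.sleep` plays the role of “waiting for data”.

```python
import asyncio
import threading


async def parked_request(request_id: int, wait_seconds: float):
    await asyncio.sleep(wait_seconds)        # stands in for waiting on data
    return request_id, threading.get_ident()


async def demo():
    results = await asyncio.gather(
        *(parked_request(i, 0.05) for i in range(1000)))
    threads_used = {thread_id for _, thread_id in results}
    return len(results), len(threads_used)


handled, threads_used = asyncio.run(demo())
print(handled, threads_used)
```

All 1,000 requests wait at the same time, yet only one thread is ever involved. With one blocked thread per request, the same load would need 1,000 threads.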
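One way to avoid leftover parked requests is to clean up in a `finally` block, as in the sketch below. It assumes the server framework cancels the handler task when the client disconnects, which not every framework does; `handle_request` and the `parked` set are names I made up, and `request.cancel()` in the demo stands in for a dropped connection.

```python
import asyncio

parked = set()     # futures of requests currently waiting for data


async def handle_request(timeout: float):
    future = asyncio.get_running_loop().create_future()
    parked.add(future)
    try:
        return await asyncio.wait_for(future, timeout)
    except asyncio.TimeoutError:
        return None                      # wait time over: empty response
    finally:
        parked.discard(future)           # runs on success, timeout and disconnect


async def demo():
    request = asyncio.create_task(handle_request(timeout=1.0))
    await asyncio.sleep(0.01)            # let the handler park itself
    parked_while_waiting = len(parked)
    request.cancel()                     # stands in for a dropped connection
    try:
        await request
    except asyncio.CancelledError:
        pass
    return parked_while_waiting, len(parked)


parked_while_waiting, parked_after = asyncio.run(demo())
print(parked_while_waiting, parked_after)
```

While the request waits there is one parked entry, and after the simulated disconnect there are none.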
For live data, latency is better than with short polling, but still not quite live.
The next big challenge for long polling is performance and scalability. How do you handle a surge in request traffic, and how are the parked requests affected by it? How are parked connections processed: one by one, or in groups? What’s the order? How are the priorities decided?
Yes, it’s challenging and complex to implement long polling at scale.
⚾️ Which Pokémon to Choose Then?
Even though Pikachu is popular, Ash won’t and shouldn’t choose Pikachu in every fight. So the question is, which polling should I use? The universal answer is “It depends”. Yes, it all comes down to the requirements we have.
If I had only a few hundred to a thousand users at any given moment and the data didn’t need to be “live”, then short polling would be the way to go. Why complicate things, right? But in my current use case, I have thousands of users. The data should be live, but I don’t know when the team will score a goal.
Instead of asking the server “Is it a goal yet? Is it a goal yet?”, I will probably choose long polling here with, let’s say, a 60-second wait time. But this won’t be my final choice. Here is why.
🤔 Let me Think…
Long polling is not the only Pokémon I have. There are Server-Sent Events (SSE) and WebSockets as options. WebSockets, for example, can be used if I want instant updates to the score. They offer near real-time responses thanks to an always-open, two-way connection. This is the tech used in instant chat applications, like the one I already built using Spring. Take a look at it here 👉 Create Your Own Instant Messaging Chat Application With Spring
The correct way to choose my Pokémon 😉 would be to solidify my requirements, analyze the traffic pattern of my website, do some benchmarking, weigh the pros and cons of each approach, and then figure out the best system. I don’t want to use a new technology just for the sake of it; I have to justify its benefit to my billion-dollar Soccer Score Startup 😎.
Please subscribe to the newsletter to hear about the latest posts from CodersTea.