Oh no, I missed another blog post this week! (This is being written after the fact.) Fear not, though, for I have not stopped work on my podcast-/playlist-listening app!
This week I implemented a dramatically improved instant-listening feature. Previously, when switching to a new episode, there would be a nontrivial delay before the episode would start playing. Now, it starts playing instantly!
As I've been playing with it, I've found it's still not perfect, but it's much better. In particular, it doesn't handle switching episodes quickly, because it's still loading an entire episode in the background while it's feeding up short segments. I have ideas for a nice little algorithm to load up chunks in a more intelligent fashion, however. To understand the improvement, we first need to understand how it works now, after the updates:
- When the first request for a given episode comes in for an audio segment, load only that segment and feed it up.
- Start a background process that begins loading the entire episode.
- If additional segment requests come in while the whole episode is still loading, serve them with additional single-segment reads.
- Finally, when the whole episode is done loading into RAM, feed segments up from memory.
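The steps above can be sketched roughly like this. This is a minimal illustration, not the app's actual code: `read_segment` and `read_all` are hypothetical stand-ins for whatever does the real file I/O.

```python
import threading

class EpisodeCache:
    """Sketch of the current behaviour: serve single segments from disk
    while a background thread loads the whole episode into RAM."""

    def __init__(self, read_segment, read_all):
        self.read_segment = read_segment  # (episode, index) -> segment bytes
        self.read_all = read_all          # episode -> list of all segment bytes
        self.full = {}                    # episode -> fully loaded segment list
        self.loading = set()              # episodes with a background load running
        self.lock = threading.Lock()

    def get(self, episode, index):
        with self.lock:
            loaded = self.full.get(episode)
            start_bg = episode not in self.loading
            if start_bg:
                self.loading.add(episode)
        if loaded is not None:
            # Whole episode finished loading: feed segments up from memory.
            return loaded[index]
        if start_bg:
            # First request for this episode: kick off the background load.
            threading.Thread(target=self._load_all, args=(episode,),
                             daemon=True).start()
        # Episode still loading: fall back to a single-segment file read.
        return self.read_segment(episode, index)

    def _load_all(self, episode):
        segments = self.read_all(episode)
        with self.lock:
            self.full[episode] = segments
```

The pain point is visible in `get`: every request that arrives before `_load_all` finishes falls through to its own `read_segment` call, and those single reads pile up.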
There are a few scenarios this doesn't handle well:
- Switching to an episode and then scrubbing through it multiple times right away.
- Switching back and forth between episodes.
- Clients that request a large number of segments upfront.
In all these cases, the server gets bogged down with requests for individual segments, each of which requires its own seek-and-read into the file, all stacked on top of each other. This is very painful for the end user, as the server stops responding quickly.
The new process that I'll eventually write goes something like this:
- When a request for a given episode comes in, load that segment and feed it up immediately. Keep track of this request in a data structure but don't do any additional loading (note that at this point some clients such as iOS/Safari will have already sent in 5+ requests at a time).
- When a second request comes in for the next contiguous segment, we notice that we just served the previous one. Instead of loading the entire episode or only one segment, load the following 2 segments into the cache and serve up the next 1 (the client's request for the segment after that is likely already in the pipeline, so we'll already have it in RAM and won't need to issue another file read).
- When the next request comes in for a segment just beyond what we just loaded, we recognize it as another contiguous request. The first time, we loaded 1 segment; the second time, 2 segments; now load 4 segments.
- Each time this happens, we load 2x the number of segments we loaded the last time. The assumption is that if the client keeps requesting segments from the same episode, it will likely continue to do so, but by not loading the entire episode all at once, we leave the server free to serve requests for segments from other episodes.
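The doubling behaviour above could be sketched like this. Again, a rough illustration under my own naming: `read_segments` is a stand-in for the real file I/O, and nothing here is the app's actual API.

```python
class PrefetchCache:
    """Sketch of the planned algorithm: double the read-ahead window on
    each contiguous cache miss; reset it when the client seeks elsewhere."""

    def __init__(self, read_segments):
        self.read_segments = read_segments  # (episode, start, count) -> list of bytes
        self.cache = {}          # (episode, index) -> segment bytes
        self.next_expected = {}  # episode -> index we expect next
        self.window = {}         # episode -> how many segments to read next miss

    def get(self, episode, index):
        if (episode, index) not in self.cache:
            if index == self.next_expected.get(episode):
                # Contiguous miss: double the prefetch window.
                self.window[episode] = self.window.get(episode, 1) * 2
            else:
                # First request for this episode, or a seek: load just 1.
                self.window[episode] = 1
            count = self.window[episode]
            for offset, data in enumerate(self.read_segments(episode, index, count)):
                self.cache[(episode, index + offset)] = data
        self.next_expected[episode] = index + 1
        return self.cache[(episode, index)]
```

Walking through sequential requests for segments 0, 1, 2, 3, …: segment 0 triggers a 1-segment read, segment 1 a 2-segment read (covering 1–2), segment 2 is served from cache, segment 3 triggers a 4-segment read (3–6), and so on, so file reads grow 1, 2, 4, 8 while the whole episode is never loaded in one go.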
This still isn't perfect, but it would be another huge improvement in client response speed.
I'm also considering talking about non-development concepts in my blog posts, and I already renamed it from Game Week to Dev Week, but I'm thinking I'll have to rename it again, to Tech Week. There are plenty of times when listening to Security Now that I think I might like to talk about a security/privacy issue instead of development, especially on slower development weeks.
Thanks for joining me for another Tech Week! 🙂