Personally, I don't think you actually need the robot part for the concept to be understandable and valid. It's a long debate, but you certainly have a point to some degree. Totally fair.
However, in the specific context I'm referring to - The Observer, or The Experiencer - I find Dennett's pointing to be useful. If we can start to comprehend how a brain comes to understand itself as having a mental self, and then to develop behaviour from that premise, we can start to see which aspects of that mental self have, and which aspects lack, validity in hard science terms.
Without this we end up like Chalmers or Tononi, utterly convinced that there must be "someone who experiences consciousness." And we set out upon our magnum opus consciousness paper, starting from an untested core assumption that will scupper it before it's past the first paragraph.
The flaw you acknowledge in Dennett's argument means that he didn't actually succeed in avoiding an infinite regress. If the mere "clanking computer" (did he have any idea how computers work when he wrote that?) could generate autobiographical narrative without also generating a sense of self, then so could a brain. But that pretty much amounts to assuming the conclusion.
I think you need to distinguish between physical and mental selfhood here, and consider how the former most likely led to the latter. Then mental selfhood needs to be broken down into its various aspects, so we can check that each is fit for purpose.
What is the distinction between the physical self and the mental self? What part of the mental self is not physical? Since we're talking about strict materialism here, the answer is obvious. The mental must be a subset of the physical, and all mental processes are therefore part of the organism, either part of its form or part of its functioning.
Instead of focusing on mental selfhood, consider mental models of the world: how they might work, what they might include, and how they might be useful for maintaining the existence of an organism.
Consider, for instance, a jellyfish that (in order to thrive in its environment) must swim upward to shallower water at night and downward into deeper water in the daytime. Would the program for this mechanism require, or benefit from, a world model that includes the self? Not really. "Undulating motions move me through a stratified space toward where conditions are better" is a possible model that includes a self, but the simpler model "undulating motions alter the conditions to make them better" works just as well.
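To make that concrete, here's a minimal sketch of the self-free version in Python, with invented sensor and action names (`sensed_pressure`, `pulse_up`, and so on). The "model" is nothing more than a mapping from currently sensed conditions to a motor command; no variable anywhere stands for the jellyfish's own position.

```python
# Minimal sketch of the self-free model (names and units are invented for
# illustration). The policy maps sensed conditions straight to an action;
# nothing in it represents the animal's own position in the water column.

def jellyfish_policy(sensed_pressure: float, is_night: bool) -> str:
    """Return a motor command given only the currently sensed conditions."""
    target = 1.0 if is_night else 5.0   # shallow at night, deep by day
    if sensed_pressure > target:
        return "pulse_up"    # pulsing this way makes conditions better
    if sensed_pressure < target:
        return "pulse_down"
    return "drift"           # conditions are already acceptable
```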
The same might be true of, say, a turtle that hatches on the beach and has to crawl to the water to survive. "Move so as to make the water closer and danger-things farther away" might work as a model, but a model that actually included the spatial positions of things (and therefore, necessarily, a concept or "state variable" of self position) might also be enough of an improvement to be worth the neural overhead.
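Sketched the same way (again with invented names and coordinates, purely to show the shape of the thing), the self-inclusive version carries an explicit state variable for the turtle's own position, updates it as the turtle moves, and steers by comparing it against the positions of other things:

```python
# Sketch of a self-inclusive model, with invented names and 2D coordinates
# (complex numbers: x + yj). The model now holds a state variable for the
# animal's own position, updated as it moves and compared against the
# positions of the water and of danger-things.

from dataclasses import dataclass, field

@dataclass
class TurtleModel:
    self_pos: complex                            # the "self position" state variable
    water_pos: complex                           # remembered position of the water line
    threats: list = field(default_factory=list)  # positions of danger-things

    def heading(self) -> complex:
        """Point toward the water, nudged away from any nearby threat."""
        h = self.water_pos - self.self_pos
        for t in self.threats:
            away = self.self_pos - t
            if abs(away) < 5:                    # threat is close: weight avoidance
                h += 2 * away
        return h / (abs(h) or 1.0)               # unit vector (or stay put at the water)

    def step(self, speed: float = 1.0) -> None:
        """Crawl one step and update the self-position variable to match."""
        self.self_pos += self.heading() * speed
```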
Now consider a mother bird that must collect food and bring it back to her hatchlings for them to have a chance to survive. This requires, obviously, a much more complex world model. She must find food, be aware of other birds (feed on what they're feeding on, but don't let them feed on what she's feeding on), be aware of other hatchlings not her own (don't feed those!), and navigate around. A self-less world model (for instance, a massive memory table matching a vast number of possible long sequences of wing movements to the resulting changes in conditions, expanding on the jellyfish technique) would become far too unwieldy. The overhead of a self-inclusive model, with elements such as self position and self motion, is not only worth it but necessary. The more complex the model of the world becomes, the more the model of the self arises as figure versus ground. And this is before we add such cognitive elements as memory, planning, or language.
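A rough back-of-the-envelope comparison makes the "unwieldy" point vivid. The numbers below are purely illustrative, but the shape of the problem isn't: the table-based, self-free approach grows exponentially with the length of a foraging trip, while the self-inclusive model only has to carry a handful of state variables.

```python
# Back-of-the-envelope comparison, with purely illustrative numbers, of the
# self-free lookup table versus a self-inclusive state model for the mother bird.

# Self-free: a table mapping every possible sequence of wing movements to the
# resulting change in conditions. Say 6 distinct wing movements and foraging
# trips of roughly 200 movements; the table needs one entry per sequence.
wing_movements = 6
trip_length = 200
table_entries = wing_movements ** trip_length
print(f"lookup-table entries: about 10^{len(str(table_entries)) - 1}")  # ~10^155

# Self-inclusive: track a few state variables (own position and motion, nest,
# food sites, other birds) and plan over them instead of memorising sequences.
state_variables = {
    "self_pos": 3, "self_motion": 3, "nest_pos": 3,
    "food_sites": 10 * 3, "other_birds": 10 * 3,
}
print(f"state-model size: {sum(state_variables.values())} numbers")     # 72 numbers
```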