This paper continues a recent line of theoretical work that seeks to explain what autoencoders learn about the data-generating distribution. A practical outcome of this line of work has been methods for sampling from autoencoders. Specifically, this paper picks up where \cite{journals/jmlr/AlainB14} left off. That paper showed that denoising autoencoders (under a number of conditions) estimate the score (the derivative of the log-density) of the data-generating distribution: the difference between the reconstruction and the input is proportional to the score. However, those same conditions limited the result: it considered only Gaussian corruption, applied only to continuous inputs, was proven only for squared error, and held only in the limit of small corruption. The current paper connects the autoencoder training procedure to implicit estimation of the data-generating distribution for arbitrary corruption and arbitrary reconstruction loss, and it handles both discrete and continuous variables with non-infinitesimal corruption noise. Moreover, the paper presents a new training algorithm called "walkback", which estimates the same distribution as the "vanilla" denoising algorithm but, as experimental evidence suggests, may do so more efficiently.
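For concreteness, the result of \cite{journals/jmlr/AlainB14} summarized above can be stated as follows (a sketch in illustrative notation, not the paper's own: $r_\sigma$ denotes the reconstruction function of a denoising autoencoder trained with squared error under Gaussian corruption of standard deviation $\sigma$, and $p$ is the data-generating density):
\[
\frac{r_\sigma(x) - x}{\sigma^2} \;\longrightarrow\; \frac{\partial \log p(x)}{\partial x}
\quad \text{as } \sigma \to 0 .
\]
The sampling procedure that the current paper generalizes and justifies alternates corruption and denoising, i.e., it runs the Markov chain
\[
\tilde{x}_t \sim \mathcal{C}(\tilde{x} \mid x_t), \qquad
x_{t+1} \sim P_\theta(x \mid \tilde{x}_t),
\]
where $\mathcal{C}$ is the (now arbitrary) corruption distribution and $P_\theta$ is the learned denoising distribution; the paper's claim is that the stationary distribution of this chain consistently estimates the data-generating distribution.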