Generate code from a UI screenshot.
_Code:_ [Demo](https://youtu.be/pqKeXkhFA3I) and [code](https://github.com/tonybeltramelli/pix2code) to come.
## Inner-workings:
The authors decompose the problem into three sub-problems:
1. a computer vision problem of understanding the given scene and inferring the objects present, their identities, positions, and poses.
2. a language modeling problem of understanding computer code and generating syntactically and semantically correct samples.
3. combining the solutions to the two previous sub-problems: the latent variables inferred from scene understanding are used to generate the corresponding textual descriptions of the objects these variables represent.
They also introduce a Domain Specific Language (DSL) for modeling purposes.
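
To make the role of a DSL more concrete, here is a minimal sketch of how a small layout language could be tokenized and turned into a vocabulary. The DSL snippet, its token names (`stack`, `row`, `label`, ...) and the helpers `tokenize` and `build_vocabulary` are all hypothetical and only for illustration; they are not taken from the paper.

```python
import re
from typing import Dict, List

# Hypothetical DSL snippet describing a UI layout.
# The token names are illustrative assumptions, not the paper's actual DSL.
DSL_SOURCE = """
stack {
  row { label, button }
  row { image }
}
"""

def tokenize(source: str) -> List[str]:
    """Split DSL text into identifiers and the punctuation tokens { } ,"""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_-]*|[{},]", source)

def build_vocabulary(tokens: List[str]) -> Dict[str, int]:
    """Map each distinct token to an integer index, in first-seen order."""
    vocab: Dict[str, int] = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

tokens = tokenize(DSL_SOURCE)
vocab = build_vocabulary(tokens)
```

A small, closed vocabulary like this is what makes a simple token-level encoding practical for the language model.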
## Architecture:
* Vision model: usual AlexNet-like architecture
* Language model: the tokens of the DSL vocabulary are one-hot encoded and fed into an LSTM
* Combined model: also an LSTM, which decodes from the outputs of the vision and language models.
[![screen shot 2017-06-16 at 11 34 28 am](https://user-images.githubusercontent.com/17261080/27221124-c9cadcc6-5287-11e7-9d38-c4234af92912.png)](https://user-images.githubusercontent.com/17261080/27221124-c9cadcc6-5287-11e7-9d38-c4234af92912.png)
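
The data flow through the three components can be sketched in NumPy as below. This is a rough illustration under several assumptions, not the authors' implementation: the vocabulary, the layer sizes, the random stand-in for the CNN image features, and the choice to concatenate the image features with the language LSTM state before the decoder LSTM are all made up for the example. The weights are random and untrained, so the resulting probabilities are meaningless; only the shapes and the wiring are of interest.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical DSL vocabulary and arbitrary layer sizes (assumptions).
VOCAB = ["<START>", "<END>", "stack", "row", "label", "button", "{", "}", ","]
V = len(VOCAB)
IMG_DIM, HID = 16, 8

def one_hot(index, size):
    """Return a vector of zeros with a single 1.0 at position `index`."""
    v = np.zeros(size)
    v[index] = 1.0
    return v

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(input_dim, hidden_dim):
    """Random (untrained) parameters for one LSTM cell."""
    W = rng.normal(scale=0.1, size=(4 * hidden_dim, input_dim + hidden_dim))
    b = np.zeros(4 * hidden_dim)
    return W, b

def lstm_step(x, h, c, W, b):
    """One step of a standard LSTM cell; returns the new (h, c)."""
    H = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c_new = f * c + i * g
    return o * np.tanh(c_new), c_new

# Stand-in for the vision model output; a real system would run a CNN here.
image_features = rng.normal(size=IMG_DIM)

lang_W, lang_b = init_lstm(V, HID)             # language model LSTM
dec_W, dec_b = init_lstm(IMG_DIM + HID, HID)   # combined (decoder) LSTM
out_W = rng.normal(scale=0.1, size=(V, HID))   # projection to vocabulary

def next_token_probs(context_ids):
    """Distribution over the next DSL token given the image and a context."""
    h_l, c_l = np.zeros(HID), np.zeros(HID)
    h_d, c_d = np.zeros(HID), np.zeros(HID)
    for idx in context_ids:
        # Language model consumes the one-hot encoded token.
        h_l, c_l = lstm_step(one_hot(idx, V), h_l, c_l, lang_W, lang_b)
        # Assumed combination: concatenate image features and language state.
        joint = np.concatenate([image_features, h_l])
        h_d, c_d = lstm_step(joint, h_d, c_d, dec_W, dec_b)
    logits = out_W @ h_d
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

probs = next_token_probs([VOCAB.index("<START>"), VOCAB.index("stack")])
```

With trained weights, repeatedly picking a token from `probs` and appending it to the context until `<END>` would yield a DSL program for the screenshot.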
## Results:
Clearly not ready for any serious use, but the results are promising!
[![screen shot 2017-06-16 at 11 57 45 am](https://user-images.githubusercontent.com/17261080/27222031-0bf8e7de-528b-11e7-896f-cdb410f928c3.png)](https://user-images.githubusercontent.com/17261080/27222031-0bf8e7de-528b-11e7-896f-cdb410f928c3.png)