Notes from the 2nd CALI Tech Briefing.
The briefing focused on AI and CALI’s work in that space. I talked about building a small BERT model, following this blog article and the code from this Jupyter notebook. I used a custom-built PC to run the process. The newer, more powerful RTX 4070 Ti GPU completed the model’s pre-training in just over 44 hours, down significantly from the 100 hours mentioned in the post. I suspect I could shave a few more hours off that time with further code optimization.
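The core of BERT pre-training is the masked language modeling (MLM) objective: hide a fraction of the input tokens and train the model to predict them. This is a minimal, self-contained sketch of that idea in PyTorch with a tiny toy encoder; the vocabulary size, layer sizes, and `[MASK]` token id are made-up placeholders, not the configuration from the blog post or notebook.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for a toy BERT-style model (not the blog's actual config).
VOCAB_SIZE = 1000
HIDDEN = 64
MASK_ID = 4        # assumed id of the [MASK] token
IGNORE = -100      # cross_entropy skips labels with this value

def mask_tokens(input_ids, mask_prob=0.15):
    """Replace ~15% of tokens with [MASK]; labels keep originals only there."""
    mask = torch.rand(input_ids.shape) < mask_prob
    mask[..., 0] = True                      # ensure at least one masked position
    labels = input_ids.clone()
    labels[~mask] = IGNORE                   # only masked positions contribute to loss
    masked = input_ids.clone()
    masked[mask] = MASK_ID
    return masked, labels

class TinyMLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
        layer = nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(HIDDEN, VOCAB_SIZE)

    def forward(self, ids):
        return self.head(self.encoder(self.embed(ids)))   # (batch, seq, vocab)

torch.manual_seed(0)
ids = torch.randint(5, VOCAB_SIZE, (2, 16))   # a fake batch of token ids
masked, labels = mask_tokens(ids)
model = TinyMLM()
logits = model(masked)
loss = nn.functional.cross_entropy(
    logits.view(-1, VOCAB_SIZE), labels.view(-1), ignore_index=IGNORE)
```

In a real pre-training run this loss would be backpropagated over millions of batches, which is where those 44 GPU-hours go.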
The result was a small pre-trained BERT model intended for masked language modeling, i.e. fill-in-the-blank prediction. Testing shows that it works, but it is limited in scope. Future plans include experimenting with this model to see what sort of capabilities are possible and exploring further training of such a model.
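The fill-in-the-blank behavior comes from reading the model’s output distribution at the `[MASK]` position and taking the most likely tokens. A minimal sketch of that readout step, using a made-up five-word vocabulary and invented logits in place of a real model’s output:

```python
import torch

# Toy vocabulary and pretend logits at the [MASK] position
# (a real model would produce these over its full vocabulary).
VOCAB = ["the", "cat", "sat", "mat", "dog"]
logits = torch.tensor([0.1, 2.5, 0.3, 1.9, 0.2])

probs = torch.softmax(logits, dim=-1)        # convert logits to probabilities
top = torch.topk(probs, k=3)                 # keep the 3 most likely fills
predictions = [(VOCAB[int(i)], float(p)) for p, i in zip(top.values, top.indices)]
# predictions is a ranked list of (word, probability) candidates for the blank
```

Libraries like Hugging Face `transformers` wrap exactly this logic in their fill-mask pipelines, returning a ranked candidate list for each masked token.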