This page gives some brief technical information about Robot Burns Infinite.
The base model is GPT-3, and I'm using the babbage model rather than davinci, basically because it's much cheaper to run. I use the same original dataset of 6000 lines of Burns poetry that I used for the 2020 Robot Burns book, which used GPT-2.
The thing I've found to be most important is how the dataset is prepared, and the model has gone through several iterations as I've changed that preparation. There are still some improvements I hope to make, for example to the number of stanzas it generates and to how well it takes your themes as inspiration.
For the "themes" feature, I used text-davinci-003 to analyse each poem in the dataset for its key themes. AI-bootstrapping!
For example, for the original Burns stanza below, it said the 5 main themes were "Mountains, snow, valleys, forests, floods".
Farewell to the mountains, high-coverd with snow,
Farewell to the straths and green vallies below;
Farewell to the forests and wild-hanging woods,
Farewell to the torrents and loud-pouring floods.
My hearts in the Highlands, etc.

The Sun had closd the winter day,
The curless quat their roarin play,
And hungerd maukin taen her way,
To kail-yards green,
While faithless snaws ilk step betray
Whare she has been.
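To make the theme-extraction step a bit more concrete, here is a minimal Python sketch of how a stanza could be turned into a prompt and how a comma-separated reply could be parsed. The prompt wording and the function names `build_theme_prompt` and `parse_themes` are assumptions for illustration, not the prompt actually used for the site, and the call to the model itself is left out.

```python
from typing import List


def build_theme_prompt(stanza: str, n_themes: int = 5) -> str:
    """Build a hypothetical instruction prompt asking for the main themes."""
    return (
        f"List the {n_themes} main themes of the following poem as a "
        "comma-separated list.\n\n"
        f"Poem:\n{stanza.strip()}\n\nThemes:"
    )


def parse_themes(reply: str) -> List[str]:
    """Split a comma-separated reply into clean, lowercased theme strings."""
    # Drop surrounding whitespace, quotation marks and a trailing full stop.
    cleaned = reply.strip().strip('."')
    return [t.strip().lower() for t in cleaned.split(",") if t.strip()]
```

The parsed list can then be stored next to each stanza, so the themes are available when the fine-tuning dataset is built.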
I have explored a few different dataset structures:
I'm currently looking at generating multiple stanzas, but the dataset I collected back in 2019 doesn't delineate between poems; it's just a big text file with lots of stanzas. Stanzas of the same poem are grouped together, but nothing marks where one poem ends and another begins.
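As a rough illustration of what one dataset structure could look like, the sketch below writes records in the JSONL prompt/completion format that the legacy GPT-3 fine-tuning workflow expected, with themes as the prompt and one or more stanzas as the completion. This is not necessarily the structure used for the site: the "Themes:" prefix, the separator, the stop marker and the function names are all assumptions. A multi-stanza record like this is only possible once poem boundaries are known, which is exactly what the 2019 text file is missing.

```python
import json
from typing import Dict, Iterable, List

PROMPT_END = "\n\n###\n\n"  # assumed separator marking the end of the prompt
COMPLETION_END = " END"     # assumed stop marker ending each completion


def make_record(themes: List[str], stanzas: List[str]) -> Dict[str, str]:
    """Build one hypothetical training record from themes and stanzas."""
    prompt = "Themes: " + ", ".join(themes) + PROMPT_END
    # Stanzas are joined with a blank line; the completion starts with a space.
    body = "\n\n".join(s.strip() for s in stanzas)
    return {"prompt": prompt, "completion": " " + body + COMPLETION_END}


def to_jsonl(records: Iterable[Dict[str, str]]) -> str:
    """Serialise records as JSON Lines, one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

With single-stanza data, `stanzas` would simply be a one-element list.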
If you feel something here is incorrect, or you'd like to talk to me more about it, then feel free to contact me. My personal site is gibsonic.org.
Copyright ©2023. Robot Burns Infinite & Perry Gibson. All rights reserved.