I would like to open a discussion about changing the way that models are written.
I find that writing models in yaml is more cumbersome and confusing than it would be if we supported JS files only, and it adds the overhead of compiling yaml files into JS. Currently `models.js` is 501 lines long, which makes it harder to maintain than it should be. All this code does is read each model file, convert it into a JS object, loop through each key, and convert functions that are stored as strings into real JS functions. I think that if we only supported JS files we could reduce this file to about 1/3 of its current size.
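A rough sketch of the kind of string-to-function conversion described above. This is not the actual code in `models.js`; the argument list and the function body are assumptions made up for illustration.

```javascript
// Hypothetical sketch: the argument names below are assumed from the names
// mentioned in this issue, not copied from models.js.
const argNames = ['documents', 'globals', 'inputs', 'faker', 'chance', 'document_index'];

// In a yaml model the function body is only a string.
const body = 'return `country_${document_index}`;';

// The string has to be compiled into a real function before it can run.
// Every name in argNames silently becomes a variable inside the body.
const build = new Function(...argNames, body);

console.log(build({}, {}, {}, null, null, 7)); // prints "country_7"
```

With JS model files this step would not be needed, because the functions are already functions when the file is loaded.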
Moving to a JS API would remove a lot of the magic that happens inside of the `pre_*`, `post_*`, and `build` functions, where we pass `context`, `documents`, `globals`, `inputs`, `faker`, `chance`, `index`, and `require` to each function. Everything that is passed in becomes a global variable (aka magic) to the developer writing the function. The developer can't easily see which variables they have access to without reading the documentation very thoroughly, which makes debugging hard. Even if they do read the documentation thoroughly, they are forced to use whatever name we gave each argument, and have to remember specific names like `document_index`.
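To make the contrast concrete, here is a minimal sketch of a function that receives one explicit context object instead of magic variables. The property names (`$inputs`, `index`) are assumptions for illustration, not a finalized API.

```javascript
// Hypothetical stand-in for the context object the library would pass in.
const context = {
  $inputs: { countries: ['US', 'CA', 'MX'] },
  index: 1,
};

// The author destructures only what is needed, so the available values are
// visible in the signature, and they can be renamed locally (index -> i).
function build({ $inputs, index: i }) {
  return $inputs.countries[i];
}

console.log(build(context)); // prints "CA"
```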
Not using yaml would also solve the problems we've had trying to use a library like `worker-farm`, which uses child workers to spread CPU-intensive tasks across all the available CPUs on the machine. That would let us run each model in a separate node instance, which should improve performance significantly. It would also help with a common problem people run into, where their computer locks up while they're trying to generate millions of records. The reason we can't use yaml alongside this is the way worker nodes work. To send options into a worker you have to send them in string form, which means we can't pass our parsed model object, full of functions, in as an argument. If we used JS, all we would have to do is send in the path to the file, then require it and execute it inside the worker.
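The serialization problem can be sketched without any worker library. The example below assumes the message is serialized as JSON, and uses a plain function (`runModel`, a hypothetical name) as a stand-in for the worker; it is not how `worker-farm` is actually wired up.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// A parsed model holds real functions, and those do not survive serialization.
const parsedModel = { name: 'Countries', build: () => 'country_1' };
const received = JSON.parse(JSON.stringify(parsedModel));
console.log(typeof received.build); // prints "undefined"

// A JS model file can be referenced by path, and a path is just a string.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'model-sketch-'));
const modelPath = path.join(dir, 'countries.js');
fs.writeFileSync(
  modelPath,
  "module.exports = { name: 'Countries', build: () => 'country_1' };\n"
);

// Stand-in for the worker: it receives a string, requires the file, runs it.
function runModel(message) {
  const { modelPath: file } = JSON.parse(message);
  return require(file).build();
}

const result = runModel(JSON.stringify({ modelPath }));
console.log(result); // prints "country_1"
```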
Another reason to use JS only is that people can require a package at the top of a file like they normally would, instead of requiring it inside of a `build` function that gets called thousands of times. It would also allow us to remove support for definitions as they exist today, because they can become a package/file that is imported at the top of a file.
I would like to rethink how models are created, and how to use a chainable function library similar to joi to do this. I think this would make creating models easier and more intuitive.
Here's an example of my proposal for using JS to generate models:
Example
```js
import fakeit from 'fakeit'
import globby from 'globby'

export default fakeit
  // all the top level options for the model would be declared inside of the options function
  .options({
    name: 'Countries',
    key: '_id',
    dependencies: [], // you could use `require` for dependencies, or pass in a string of the filepath
    inputs: {
      // so that this doesn't get off topic: we would support an object for `inputs`
      countries: '../input/countries.json',
    },
    // several of our examples set the count dynamically based off of inputs or data from other dependencies
    // so to me it makes sense to allow `min`, `max`, and `count` to be a number or a function.
    // This allows a cleaner way to dynamically set the count.
    // Note I wouldn't make these functions async capable.
    count({ $inputs }) {
      // inputs would be `$inputs` to note that it's dynamic
      return $inputs.countries.length
    },
    // `before` replaces `pre_run`
    // `beforeEach` replaces `pre_build`
    // `afterEach` replaces `post_build`
    // `after` replaces `post_run`
    // Using this terminology makes more sense because it lines up with how testing libraries are set up
    // which should make it easier for users to understand what it is
    async before(context) {
      // Anything that would be set on `context` would be local to this model, not other models in other files.
      // This way there are no conflicts with other models overwriting variables
      context.index = 0;
      // for those that are curious, you would still be able to reference the count inside of a `before` function
      context.$options.count = context.$inputs.countries.length;
    },
    beforeEach(context) {
      // it would be nice to convert things like chance to become plugins and we could reference them with a `$`
      // just like the other instances of dynamic things. This is also how `vue` handles plugins. We could also do
      // `this.$plugins.chance`, but I don't think there's any benefit to that except you get to work on your
      // typing skills more than you should.
      context.index = context.$chance.integer({ min: 0, max: context.$inputs.countries.length - 1 })
    }
  })
  // `type` would no longer be relevant because we can just use the `type` itself, just like `joi`.
  // just like with joi you would be able to do `object([schema])`, or `object.keys([schema])`
  .object({
    _id: fakeit
      // `.after` would replace post_build.
      .after(() => `country_${fakeit.ref('gdp')}`),
      // .string() // would convert the result to a string. We would no longer require a `type`
    gdp: fakeit
      .build(({ $faker }) => $faker.random.number({ min: 1000, max: 75000 }))
      .integer(), // would ensure it's an integer
    phones: fakeit
      .array()
      .items(fakeit.object({
        type: fakeit
          .build(({ $faker }) => $faker.random.arrayElement([ 'Home', 'Work', 'Mobile', 'Main', 'Other' ])),
        phone_number: fakeit
          .build(({ $faker }) => $faker.phone.phoneNumber().replace(/\s*x[0-9]+$/, '')),
      }))
      // .length() would specify the specific length of the array
      .min(2)
      .max(20)
  })
```
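For anyone wondering how a chain like the one in the proposal could work internally, here is a very small sketch. It is only an assumption about one possible implementation: `Field`, `generate`, and `$random` are hypothetical names, and it covers just `build` and `integer`.

```javascript
// Hypothetical minimal chainable builder. Each method records a step and
// returns `this`, which is what makes the calls chainable.
class Field {
  constructor() {
    this.steps = [];
  }

  // Registers a function that produces the value from the context.
  build(fn) {
    this.steps.push((context) => fn(context));
    return this;
  }

  // Coerces the previous step's value to an integer.
  integer() {
    this.steps.push((context, value) => Math.trunc(Number(value)));
    return this;
  }

  // Runs the recorded steps in order, feeding each result to the next step.
  generate(context) {
    return this.steps.reduce((value, step) => step(context, value), undefined);
  }
}

const gdp = new Field()
  .build(({ $random }) => 1000 + $random() * 74000)
  .integer();

// A fixed `$random` stands in for a real random source so the result is repeatable.
const value = gdp.generate({ $random: () => 0.5 });
console.log(value); // prints 38000
```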
I think there's a lot of really neat stuff we can do using JS like this that would make creating models easy. If we decide that we still want to support yaml for some reason, I would suggest making it a plugin for the app so that the default behavior is JS.
I would like to hear from several people in the field who currently use fakeit and get their thoughts on this, to find out whether it's worth continuing to support yaml.
@bentonam @mistersender @tabrindle @brantburnett @alburdette619