Can you explain what is required for 4.4 and 4.5?

Should I randomize the weights 10,000 times and, each time, run the simulation for 1,000 epochs?

Also, what is the point of the graph plot? The question says:

"Plot a histogram of number of episodes required until score 200 and report the average number of episodes"
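
To make the quoted sentence concrete, here is a minimal sketch of one literal reading of it: in each trial, keep drawing random weights until one episode scores 200, record how many episodes that took, then repeat over many trials and summarize the counts. This is only an assumption about what the exercise intends. The names `episodes_until_solved`, `run_trials`, and `toy_run_episode` are hypothetical, the 4-element weight vector and the 1,000 trials are guesses, and the toy episode function is a stand-in for the real simulation.

```python
import random
from collections import Counter
from statistics import mean


def episodes_until_solved(run_episode, rng, target=200, max_episodes=10_000):
    """Draw fresh random weights each episode; return how many episodes
    were needed before one of them scored at least `target`."""
    for episode in range(1, max_episodes + 1):
        # Assumption: a linear policy with 4 weights, each drawn from [-1, 1].
        weights = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        if run_episode(weights) >= target:
            return episode
    return max_episodes  # never solved; the count is capped


def toy_run_episode(weights):
    """Hypothetical stand-in for the real simulation, NOT the assignment's
    environment: it 'scores 200' only when all four weights are positive."""
    return 200 if all(w > 0 for w in weights) else 10


def run_trials(run_episode, n_trials=1000, seed=0):
    """Repeat the random search n_trials times and collect the episode counts."""
    rng = random.Random(seed)
    return [episodes_until_solved(run_episode, rng) for _ in range(n_trials)]


counts = run_trials(toy_run_episode)
histogram = Counter(counts)   # episode count -> number of trials with that count
average = mean(counts)        # the "average number of episodes" to report
print("average episodes until score 200:", average)
```

Under this reading, the histogram describes the distribution of "episodes until solved" across independent trials, not progress within a single run. Swapping `toy_run_episode` for the actual simulation, and passing `histogram` to a plotting library, would be the remaining steps.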

Since it is a random search, I don't see the point in plotting anything: there isn't a learning curve, just some random trials.

I think I am misunderstanding something.

