a) Big Data computing can make use of proprietary distributed file systems that are optimised to maximise efficiency for a particular set of applications.


The Google File System (GFS) is an example of such a distributed file system.
With the aid of a diagram, explain the architecture of a GFS cluster and suggest TWO ways in which this architecture is well suited to particular types of application.

b) In a Big Data initiative, you might consider using Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) from a public cloud provider. Explain what each of these services provides and give TWO advantages and TWO disadvantages you might expect from each service.

