Difference between Cloud Computing and Distributed Computing

“Cloud computing” and “distributed computing” are two terms that are often mentioned together. Although they sound similar, they describe different things, and the distinction matters to anyone who works with cloud or distributed systems. This article discusses the difference between cloud computing and distributed computing.

What is Cloud Computing?

Cloud computing refers to the delivery of computing services over the internet from remote servers, rather than from a local server or a personal computer. It allows organizations and individuals to access a shared pool of computing resources, such as servers, storage devices, databases, and software applications, without having to build and maintain their own IT infrastructure. In short, cloud computing lets users access IT resources on demand, from anywhere with an internet connection.

Here is sample Python code that demonstrates how to connect to a cloud-hosted database and retrieve some data:

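import psycopg2

# Initialise both names so the cleanup in "finally" is safe even if
# connect() or cursor() raises before they are assigned.
connection = None
cursor = None
try:
    # Placeholder credentials and host; replace them with your own.
    connection = psycopg2.connect(
        user="user",
        password="password",
        host="example.com",
        port="5432",
        database="mydatabase"
    )
    cursor = connection.cursor()
    cursor.execute("SELECT * FROM customers")
    customers = cursor.fetchall()
    for customer in customers:
        print(customer)
except psycopg2.Error as error:
    print("Error while working with PostgreSQL", error)
finally:
    if cursor is not None:
        cursor.close()
    if connection is not None:
        connection.close()
        print("PostgreSQL connection is closed")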

In this code snippet, we are using the psycopg2 library to connect to a PostgreSQL database hosted on a cloud server. We then execute a simple SQL query to select all records from the customers table and display them on the console.

What is Distributed Computing?

Distributed computing, on the other hand, refers to a model of computing that involves multiple computers connected to each other and working together to solve a complex problem. In distributed computing, tasks are divided into smaller sub-tasks, and each sub-task is run on a separate computer. The results from each computer are collected and combined to produce the final result.

Here is sample Java code that demonstrates a distributed job written for the Apache Hadoop framework:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;
import java.util.StringTokenizer;

public class WordCount {
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable>{
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

In this code snippet, we are using the Apache Hadoop framework to perform a word count analysis on a text file. We define two classes: TokenizerMapper and IntSumReducer, which are responsible for mapping and reducing the input data, respectively. We also set up a job configuration that specifies the input and output paths for the data.
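
To make the map and reduce steps easier to see without a Hadoop cluster, here is a minimal single-machine sketch in Python. It is not how Hadoop works internally: worker threads stand in for separate computers, and the function names map_chunk, reduce_pairs, and word_count are hypothetical names chosen for this illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in chunk.split()]


def reduce_pairs(pairs):
    """Reduce step: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(chunks, workers=2):
    """Run the map step on each chunk in a worker thread, then reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    # Flatten the per-chunk results before combining them.
    all_pairs = [pair for pairs in mapped for pair in pairs]
    return reduce_pairs(all_pairs)


if __name__ == "__main__":
    print(word_count(["the cat sat", "the dog sat down"]))
```

Each chunk plays the role of one input split: the map step runs on the chunks independently, and only the reduce step needs to see all of the intermediate pairs.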

Differences between Cloud Computing and Distributed Computing

Now that we have discussed what cloud computing and distributed computing are and have seen some sample code, we can highlight the differences between these two models of computing. Here are the main differences:

Architecture

From the user's point of view, cloud computing follows a client-server model: a provider operates the computing resources, and clients request them over the network. The provider is responsible for managing the resources and ensuring their availability. Behind that single service interface, a cloud platform is typically itself a large distributed system that runs across many machines.

Distributed computing, on the other hand, describes any system in which multiple computers work together to achieve a common goal. Some distributed systems follow a peer-to-peer model, with no central server that provides computing resources or manages the infrastructure: each peer contributes its resources to the system, and the computing workload is shared among the peers. Others, such as a Hadoop cluster, use coordinator nodes to schedule work across worker nodes.

Resource Management

In cloud computing, the provider is responsible for managing the resources and ensuring their availability to the clients. This includes activities such as provisioning, maintenance, scaling, and security of the underlying infrastructure. Users of cloud computing pay for the amount of resources they use, and they do not have to manage the physical infrastructure themselves.

In distributed computing, each peer is responsible for managing its resources and ensuring their availability to the system. This includes activities such as load balancing, fault tolerance, and security. Users of distributed computing have to manage the infrastructure themselves, and they have to ensure that their peers are working together effectively.
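
As a small illustration of the load balancing mentioned above, here is a hedged sketch of round-robin assignment, one of the simplest strategies. The function name assign_round_robin and the task and peer labels are hypothetical; real systems usually also consider how busy each peer currently is.

```python
from itertools import cycle


def assign_round_robin(tasks, peers):
    """Hand out tasks to peers in turn, so each peer gets a near-equal share."""
    if not peers:
        raise ValueError("at least one peer is required")
    assignment = {peer: [] for peer in peers}
    # cycle() repeats the peer list endlessly; zip() stops when tasks run out.
    for task, peer in zip(tasks, cycle(peers)):
        assignment[peer].append(task)
    return assignment


if __name__ == "__main__":
    print(assign_round_robin(["t1", "t2", "t3", "t4", "t5"], ["peer-a", "peer-b"]))
```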

Scalability

Cloud computing is highly scalable, as the provider can add or remove resources quickly and easily. This makes cloud computing ideal for situations where the workload is unpredictable or fluctuates frequently. Users can add or remove resources on demand, and they only pay for what they use.
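
The idea of scaling with the workload and paying only for what is used can be sketched with a little arithmetic. The following is an illustration only: the function names, the per-instance capacity, and the price are hypothetical values, not figures from any real provider.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance, min_instances=1):
    """Return how many instances are needed to serve the given load."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up: a partly used instance still has to be running.
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, needed)


def hourly_cost(instances, price_per_instance_hour):
    """Pay-per-use cost of running the given number of instances for one hour."""
    return instances * price_per_instance_hour


if __name__ == "__main__":
    for load in (50, 950, 120):
        count = instances_needed(load, capacity_per_instance=100)
        print(load, count, hourly_cost(count, price_per_instance_hour=0.05))
```

When the load rises from 50 to 950 requests per second, the instance count in this sketch rises from 1 to 10, and it falls back to 2 once the load drops to 120, so the cost follows the workload.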

A distributed system that you operate yourself is often harder to scale, as the number of peers is typically fixed or limited by the hardware available. Adding or removing peers can be difficult, especially if the system is already running and was not designed for it. This makes such a system better suited to situations where the workload is predictable or static.

Reliability

Cloud computing is generally highly reliable, as providers typically run redundant hardware and backup systems. This helps keep the system available even if there are hardware failures or other disruptions, although outages can still occur.

Distributed computing can be less reliable when each peer is responsible for its own reliability. If one peer fails, the system may continue to run with the remaining peers, provided it was designed to tolerate the failure, but the overall performance may suffer.
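
The following sketch shows one simple way a system could keep running when a peer fails: it tries each task on the peers in order and moves on to the next peer if one is unreachable. The names run_with_failover and run_task are hypothetical, and a failure is modelled here as a ConnectionError raised by the caller-supplied run_task function.

```python
def run_with_failover(tasks, peers, run_task):
    """Run each task on the first peer that responds.

    run_task(peer, task) is supplied by the caller and is assumed to raise
    ConnectionError when the peer cannot be reached.
    """
    results = {}
    for task in tasks:
        for peer in peers:
            try:
                results[task] = run_task(peer, task)
                break  # this task is done; move on to the next one
            except ConnectionError:
                continue  # peer is down; try the next peer
        else:
            # The inner loop finished without a break: every peer failed.
            raise RuntimeError(f"no peer could run task {task!r}")
    return results
```

In this sketch the surviving peers absorb the work of the failed one, which is why the system stays available while its overall performance can drop.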

Conclusion

In summary, cloud computing and distributed computing are two different models of computing, each with its own advantages and disadvantages. Cloud computing is ideal for situations where the workload is unpredictable or fluctuates frequently, while a self-managed distributed system is a better fit where the workload is predictable or static. Users of cloud computing can take advantage of the provider's scalability and reliability, while users of distributed computing have to manage the infrastructure themselves.
