C Sockets - No Need For A Web Server!
Written by Mike James
Monday, 15 August 2016
While advising on how to put together a C data collection program, I was part of a conversation that suggested that to host a web page of results we need to install Apache. No way!
Sockets are a general purpose way of communicating over networks and similar infrastructure. Essentially they are a generalization of streams to things other than storage devices. The problem with sockets is that they are called "sockets", which sounds strange. In addition they are so general purpose that it can be difficult to see what you can do with them.
The case mentioned in the introduction was a C program that needed to send some data using the web - an HTML page or JSON data. At first sight this seemed to need a web server and so the programmers concerned were considering installing Apache. The system in question was a Raspberry Pi and could manage to run Apache, but in this case the whole solution was overkill.
It is very easy to implement a simple web server or a web client using sockets. This is what this article explains and along the way you will discover how versatile sockets are. You can use them to communicate using almost any standard protocol, like HTTP, or a custom protocol of your own devising. All sockets do is transport data from one point to another.
The basic steps in using a socket are fairly simple: create the socket, connect it to an address, and transfer data.
Sockets connect to other sockets by their addresses.
The simplest case is where there are just two sockets or two endpoints communicating. Once the connection is made the two sockets operate in more or less the same way. However, in general one of the sockets will have initiated the connection - the client - and the other will have accepted the connection - the server.
There is a conceptual difference between a client and a server socket. A server socket is set up and then it waits for clients to connect to it. A client socket actively seeks a connection with a server. Once connected, data can flow in both directions and the difference between the two ends of the connection becomes less important.
The key idea is that a socket is implemented to make it look as much like a standard Linux file as possible. This conforms with a general principle of Linux that any I/O facility should follow the conventions of a file.
The basic socket functions that you need to know are:
Create a socket
The socket function takes three parameters, socket_family, socket_type and protocol, and returns a socket descriptor, an int, which you use in other socket functions.
The socket_family is where you specify the type of communications link to be used and this is where sockets are most general. There are lots of communications methods that sockets can use, including AF_UNIX or AF_LOCAL, which don't use a network but allow intercommunication between processes on the same machine. In most cases you are going to be using AF_INET for IPv4 or AF_INET6 for IPv6 networking.
The socket_type specifies the general protocol to be used. In most cases you will use SOCK_STREAM, which specifies a reliable two-way connection - for IP communications this means TCP/IP is used. For some applications you might want to use SOCK_DGRAM, which specifies that the data should be sent without confirming that it has been received. This is a connectionless datagram mechanism that corresponds to UDP for IP communications.
The protocol parameter selects a sub-protocol of the socket type. In most cases you can simply set it to zero.
As we are going to be working with sockets that basically work with the web we will use AF_INET and SOCK_STREAM.
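As a minimal sketch of the call just described, the following creates an IPv4 TCP socket and checks for failure. The helper name make_tcp_socket is hypothetical and not part of the sockets API.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: create an IPv4 TCP socket.
   Returns the socket descriptor, or -1 if the call fails. */
int make_tcp_socket(void)
{
    /* AF_INET = IPv4, SOCK_STREAM = TCP, 0 = default protocol for the type */
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        perror("socket");
        return -1;
    }
    return sockfd;
}
```

The descriptor that comes back is an ordinary small integer, just like one returned by open, and should eventually be passed to close.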
Connect a socket to an address
To connect a socket as a client of another socket you need to use the connect function, which takes three parameters: sockfd, addr and addrlen.
The sockfd parameter is just the socket file descriptor returned from the socket function. The addr parameter points at a sockaddr struct which contains the address of the socket you want to connect to. Of course addrlen just specifies the size of the struct.
The socket address type depends on the underlying communications medium that the socket uses but in most cases, and certainly in this article, it is just an IP address.
As addresses are used in many different socket functions it is worth dealing with how to construct an address as a separate topic.
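A rough sketch of a connect call, assuming the caller has already filled in an IPv4 address struct, might look like this. The wrapper name connect_ipv4 is made up for illustration; the points to notice are the cast to the generic struct sockaddr pointer and the use of sizeof for addrlen.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: connect sockfd to an already prepared IPv4 address.
   Returns 0 on success, -1 on failure. */
int connect_ipv4(int sockfd, const struct sockaddr_in *addr)
{
    /* connect takes the generic sockaddr type, so the IPv4-specific
       struct has to be cast, and its size passed as addrlen. */
    if (connect(sockfd, (const struct sockaddr *)addr, sizeof *addr) < 0) {
        perror("connect");
        return -1;
    }
    return 0;
}
```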
Bind a socket to an address
To assign a server socket an address it will respond to, use the bind function, which takes the same three parameters as connect: sockfd, addr and addrlen.
The sockfd parameter is just the socket file descriptor returned from the socket function and addr is a pointer to an address struct.
Beginners often ask what the difference is between connect and bind. The answer should be obvious. Connect makes a connection to the socket with the specified address whereas bind makes the socket respond to that address. Put another way - use connect with a client socket and bind with a server socket.
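To make the server side concrete, here is a small sketch that binds a socket to a port on any local interface. The helper name bind_to_port is hypothetical, and the use of INADDR_ANY is an assumption about what a simple server would want, not something the text above prescribes.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: make sockfd respond to the given port on every
   local IPv4 interface. Returns 0 on success, -1 on failure. */
int bind_to_port(int sockfd, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local interface */
    addr.sin_port = htons(port);               /* port in network byte order */

    if (bind(sockfd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("bind");
        return -1;
    }
    return 0;
}
```

Passing a port of 0 asks the system to pick a free port, which is handy for experiments; ports below 1024, such as 80, usually need elevated privileges.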
Reading And Writing
There isn't anything much to say about sending and receiving data from an open socket because it is just a file and you can use the standard read and write functions that you would use to work with a file. Of course there are some differences and some additional features that you need to work with a network, but this is the general principle.
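The following sketch shows plain read and write working on sockets. To keep it runnable on one machine without a network, it assumes a pair of already connected local sockets created with socketpair, which is not the approach used in the rest of this article; the function name send_and_receive is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: push msg through a connected pair of local sockets
   using only write() and read(). Returns bytes read, or -1 on error. */
ssize_t send_and_receive(const char *msg, char *buf, size_t buflen)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    ssize_t n = -1;
    size_t len = strlen(msg);
    if (write(sv[0], msg, len) == (ssize_t)len)
        n = read(sv[1], buf, buflen);  /* may return fewer bytes than asked for */

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

One of the differences from a disk file is visible in the comment: a read on a stream socket returns whatever has arrived so far, so real code usually loops until it has the whole message.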
Listen and Accept
There is one small matter that we have to deal with that takes us beyond simple file use semantics. If you have opened a socket and bound it to an IP address then it is acting as a server socket and is ready to wait for a connection.
How do you know when there is a connection and how do you know when to read or write data?
Notice this problem doesn't arise with a client socket because it initiates the complete connection and sends and receives data when it's ready.
The listen function, which takes sockfd and backlog parameters, sets the socket as an active server. From this point on it listens on the address and port it is bound to and queues incoming connections. The backlog parameter sets how many pending connections will be queued for processing.
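A minimal sketch of putting a socket into the listening state, with the backlog passed through, could look like this. The helper name start_listening is hypothetical.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: mark sockfd as a listening server socket that
   queues up to backlog pending connections. Returns 0 or -1. */
int start_listening(int sockfd, int backlog)
{
    if (listen(sockfd, backlog) < 0) {
        perror("listen");
        return -1;
    }
    return 0;
}
```

A small backlog such as 5 is plenty for a simple data collection server; the constant SOMAXCONN asks for the largest queue the system allows.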
The actual processing of a connection is performed by the accept function, which takes sockfd, addr and addrlen parameters.
The accept command provides the address of the client trying to make the connection in the sockaddr structure. It also returns a new socket file descriptor to use to talk to the client. The original socket carries on operating as before. Notice that this is slightly more complicated than you might expect in that it is not the socket that you created that is used to communicate with the client. The socket you created just listens out for clients and creates a queue of pending requests. The accept function processes these requests and creates a new socket that is used to communicate with the client.
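The relationship between the listening socket and the new socket can be sketched as follows. The helper name accept_client and its parameters are hypothetical; it assumes an IPv4 listening socket and a non-empty buffer for the client's address in text form.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: take one pending client from listenfd.
   Returns a NEW descriptor for talking to that client, or -1 on error.
   The client's IPv4 address is written to ipbuf as text. */
int accept_client(int listenfd, char *ipbuf, socklen_t ipbuflen)
{
    struct sockaddr_in client;
    socklen_t len = sizeof client;  /* in: size of struct, out: bytes filled in */

    int connfd = accept(listenfd, (struct sockaddr *)&client, &len);
    if (connfd < 0) {
        perror("accept");
        return -1;
    }

    /* Convert the binary address to dotted text; empty string on failure. */
    if (inet_ntop(AF_INET, &client.sin_addr, ipbuf, ipbuflen) == NULL)
        ipbuf[0] = '\0';

    return connfd;  /* listenfd stays open and keeps queuing clients */
}
```

All reading and writing for this client goes through the returned descriptor, while the original descriptor is only ever passed to accept again.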
This still doesn't solve the problem of how the server detects that there are clients pending.
This is a complicated question with many different solutions.
You can set up the listening socket to be either blocking or non-blocking. If it is blocking then a call to accept will not return until there is a client ready to be processed. If it is non-blocking and no client is waiting then a call to accept returns -1 at once with errno set to EAGAIN or EWOULDBLOCK. So you can either use a blocking call or you can poll for clients to be ready. A more complex approach would be to use another thread to call the poll() function, which waits with no CPU overhead while the file descriptor isn't ready.
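The two options can be sketched roughly as below. Both helper names are hypothetical, and the return value of -2 for "no client yet" is just a convention chosen for this example. The first function switches a descriptor to non-blocking mode; the second uses poll() to sleep until a client is pending or a timeout expires.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: wait up to timeout_ms for a pending client.
   Returns the new descriptor, -1 on error, or -2 if nothing arrived. */
int accept_with_timeout(int listenfd, int timeout_ms)
{
    struct pollfd pfd = { .fd = listenfd, .events = POLLIN };

    int ready = poll(&pfd, 1, timeout_ms);  /* sleeps rather than spinning */
    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;                          /* timed out, no client */

    int connfd = accept(listenfd, NULL, NULL);
    if (connfd < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;                          /* client went away before accept */
    return connfd;
}
```

Calling accept_with_timeout in a loop lets a single-threaded program get on with other work, such as collecting data, between checks for clients.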
More of this later.
A Web Client
We now have enough information to implement our first socket program - a web client. It has to be admitted that a web client isn't as common a requirement as a web server, but it is simpler and illustrates most of the points of using sockets to implement an HTTP transaction.
The first thing we have to do is create a socket, and for the TCP needed for an HTTP transaction this is just a call to socket with AF_INET, SOCK_STREAM and a protocol of zero.
To allow this to work you have to include the sys/socket.h header.
Next we need to get the address of the server we want to connect to. For the web this would usually be done using a DNS lookup on a domain name. To make things simple we will skip the lookup and use a known IP address. Example.com is a domain name provided for use by examples and you can find its address by pinging it. At the time of writing it was hosted at:
This could change so check before concluding that "nothing works".
The address structure, struct sockaddr_in, has three fields. The first, sin_family, is just set to AF_INET to indicate an internet IPv4 address.
The next field, sin_port, is the port number of the IP address, but you can't simply assign it a value such as 80, the standard HTTP port, because the byte order used on the Internet isn't the same as that used on most processors. So you have to use a utility function, htons, that will ensure the correct byte order.
The function name stands for host to network short and there are other similarly named functions, such as htonl, ntohs and ntohl.
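Putting the address fields together, a sketch of filling in the structure might look like this. The helper name make_ipv4_address is hypothetical, and using inet_pton to convert a dotted text address for the remaining field, sin_addr, is an assumption about one reasonable way to do it rather than the method this article goes on to use.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill in an IPv4 address from dotted text and a
   port given in host byte order. Returns 0 on success, -1 if the text
   is not a valid IPv4 address. */
int make_ipv4_address(struct sockaddr_in *addr, const char *ip,
                      unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);  /* host to network byte order */

    /* inet_pton returns 1 only when the text was parsed successfully. */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;
    return 0;
}
```

On a little-endian machine such as a Raspberry Pi or a PC, htons(80) swaps the two bytes of the port so that the most significant byte is sent first; on a big-endian machine it does nothing.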
Last Updated (Monday, 15 August 2016)