Let’s Build a Web Server. Part 1 (ruslanspivak.com)
131 points by kissgyorgy on April 3, 2015 | 35 comments


Pretty much the same minimal http server in C:

    #include <WinSock.h>
    #include <stdio.h>
    #include <string.h>   /* memset */
    #pragma comment(lib, "wsock32.lib")
    int main(int argc, const char *argv[]) {
        WSADATA wsadata;
        WSAStartup(2, &wsadata);
        struct sockaddr_in address;
        memset(&address, 0, sizeof(address));
        address.sin_family      = AF_INET;
        address.sin_addr.s_addr = inet_addr("0.0.0.0");
        address.sin_port        = htons(80);
        int sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
        bind(sock, (struct sockaddr *)&address, sizeof(address));
        listen(sock, 0);   /* only needed once, before the accept loop */
        for(;;) {
            int connection = accept(sock, NULL, NULL);
            char recvBuffer[1024];
            int recvSize = recv(connection, recvBuffer, sizeof(recvBuffer)-1, 0);
            if (recvSize < 0) recvSize = 0;   /* recv returns -1 on failure */
            recvBuffer[recvSize]=0;
            printf("%s", recvBuffer);   /* never use received data as the format string */
            char response[] = "HTTP/1.1 200 OK\nContent-Type: text/html\n\nlol";
            send(connection, response, sizeof(response)-1, 0);   /* -1: skip the trailing NUL */
            closesocket(connection);
        }
        return 0;
    }


> char response[] = "HTTP/1.1 200 OK\nContent-Type: text/html\n\nlol";

Isn't it better to write that as:

const char* response = "HTTP/1.1 200 OK\nContent-Type: text/html\n\nlol"; ?

Having the string in a writable buffer when you have no intention of modifying the string seems like not a very good idea.

Any decent compiler will evaluate "strlen(response)" at compile time and hence there will be no run time penalty here.
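To make the practical difference concrete, here is a minimal sketch. The names `response_array`, `response_ptr` and the two helper functions are made up for illustration. With an array, `sizeof` yields the string length plus the terminating NUL; with a pointer, `sizeof` is only the size of the pointer, so the length has to come from `strlen`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
static const char response_array[] = "HTTP/1.1 200 OK\nContent-Type: text/html\n\nlol";
static const char *response_ptr    = "HTTP/1.1 200 OK\nContent-Type: text/html\n\nlol";

/* Byte count to pass to send() when the response is an array:
   sizeof counts the terminating NUL, so subtract one. */
size_t array_payload_length(void) {
    return sizeof(response_array) - 1;
}

/* Byte count to pass to send() when the response is a pointer:
   sizeof(response_ptr) would be the size of the pointer itself,
   so the length has to come from strlen. */
size_t pointer_payload_length(void) {
    return strlen(response_ptr);
}
```

Both helpers return the same number, so either declaration works with `send` as long as the length is computed the matching way.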


> Having the string in a writable buffer when you have no intention of modifying the string seems like not a very good idea.

Any decent compiler will also notice this and put the string literal in the .RODATA section. Using "const" doesn't really make a difference, it's more of a hint for the compiler.

It really doesn't make any difference in this case.


It may be a hint to modern compilers, but it's also the documentation of intent for future developers[1].

[1] Where future developers includes me - I'll forget what I wrote in six months or less.


Not just as documentation, it would prevent another developer from accidentally modifying it.


Excellent point ... I kind of assumed that (your IDE giving you a warning that you're trying to modify a constant) when I was thinking about it as documentation but your point more succinctly describes "making it hard to make mistakes".


It does make a difference: the compiler doesn't know what the function "send" will do. The const isn't the focus of what he said; the point is allocating on the stack vs. in a read-only memory location.


This is more in line with what the blog's author should've done, given the grandiose intro about trying to understand the underlying system. Good work!


Why is for(;;) used instead of while(true)? I've seen this idiom used many times but never understood the reasoning behind it. According to this SO question[0], there are no performance gains, so why would one prefer the for version?

[0] http://stackoverflow.com/questions/2288856/when-implementing...


They are equivalent. It's just a matter of personal preference. I like it because it's shorter and it doesn't read as a loop that evaluates the value 'true' on each iteration. Even though you know the evaluation optimizes away trivially if you think about it, I don't even want to think that far when I'm scanning code.


I had a professor who gave some reasoning about this. He's written a paper: http://plg.uwaterloo.ca/~pabuhr/papers/MELoop.ps

I haven't read the whole thing but I like this bit:

> There are many algorithms that are best written using an exit from the middle of a loop. If not written in this fashion, techniques such as the traditional “priming” of a loop must be used. For example, in reading a file:

  read(a, b, c)
  WHILE NOT eof DO
    { process a, b, & c }
    read(a, b, c)
  END WHILE


  LOOP
    read(a, b, c)
    WHEN eof EXIT
    { process a, b, & c }
  END LOOP
> The example on the left [top] shows the priming of a loop. This requires duplication of the READ statement or, in other cases, creation of a subprogram to eliminate the duplication, both of which are undesirable. As well, loop priming can be done an arbitrary distance before the beginning of the loop making the program difficult to read as the priming statement(s) is critical to the understanding of the loop.
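In C, the mid-loop exit from the quoted passage maps onto `for(;;)` with `break`. A minimal sketch, where `sum_until_eof` is a hypothetical function and summing integers stands in for "process a, b, & c":

```c
#include <stdio.h>

/* Reads whitespace-separated integers from `in` until EOF (or the first
   token that is not an integer) and returns their sum. The read appears
   exactly once, in the middle of the loop, so no priming read is needed. */
long sum_until_eof(FILE *in) {
    long sum = 0;
    int value;

    for (;;) {
        if (fscanf(in, "%d", &value) != 1) break;   /* WHEN eof EXIT */
        sum += value;                               /* process the item */
    }
    return sum;
}
```

Calling `sum_until_eof(stdin)` would sum the numbers typed or piped into the program.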


This wasn't really my question. I was asking about the different infinite loop idioms in C family languages and the rationale for using one over the other.


Because the original K&R book introduced the for(;;) construct.


In the first versions of C, 'true' was not a keyword, so for(;;) was the way an endless loop was encoded. After that, it became the convention.


Why not just use while(1)?


Personal preference. Some people think "for(;;)" looks nicer.


The "for" version is also shorter by a single byte.


I just started learning Go last night, but I'll hazard a guess: since Go draws heavily from C, but has the for loop as its only looping construct, it seems likely that this habit has carried back into C code written by developers who've used Go, or who've been reading a lot of C code written by developers who also use Go.


for(;;) existed way before Go came along.


I'm not saying it didn't, but 'while (true)' would be the more intuitive way to write that in any language besides Go.


Another reason is that MSVC whines about loop conditionals evaluating to constants. Or at least it used to. So while(1) would cause a warning, but not for(;;)
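For reference, here are the three spellings from this subthread side by side, as a small sketch with made-up function names. They behave identically; the only practical difference is that `true` requires `<stdbool.h>` (C99 onward) unless the compiler targets C23, where it is a keyword.

```c
#include <stdbool.h>   /* needed for `true` before C23 */

/* Each function counts up to `limit` and leaves the loop with break;
   the only difference is how the endless loop is spelled. */
int count_with_for(int limit) {
    int n = 0;
    for (;;) {              /* no condition at all */
        if (n >= limit) break;
        n++;
    }
    return n;
}

int count_with_while_1(int limit) {
    int n = 0;
    while (1) {             /* constant condition */
        if (n >= limit) break;
        n++;
    }
    return n;
}

int count_with_while_true(int limit) {
    int n = 0;
    while (true) {          /* same, using the stdbool.h spelling */
        if (n >= limit) break;
        n++;
    }
    return n;
}
```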


See this list of winsock alternatives if trying on non-windows http://tangentsoft.net/wskfaq/articles/othersys.html
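For anyone trying this on a Unix-like system, a rough sketch of the same server against the POSIX sockets API could look like the following. The function names `make_listener` and `serve_one` are made up for this example, and it differs from the Winsock version above in a few deliberate ways: there is no `WSAStartup`, `close` replaces `closesocket`, return values are checked, and the response uses CRLF line endings, which is what HTTP specifies.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on the given port (0 lets the OS pick one).
   Returns the socket descriptor, or -1 on error. */
int make_listener(unsigned short port) {
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) return -1;

    struct sockaddr_in address;
    memset(&address, 0, sizeof(address));
    address.sin_family      = AF_INET;
    address.sin_addr.s_addr = htonl(INADDR_ANY);
    address.sin_port        = htons(port);

    if (bind(sock, (struct sockaddr *)&address, sizeof(address)) < 0 ||
        listen(sock, 16) < 0) {
        close(sock);   /* POSIX uses close() where Winsock uses closesocket() */
        return -1;
    }
    return sock;
}

/* Accept one connection, print the request, send a fixed response.
   Returns 0 on success, -1 if accept fails. */
int serve_one(int listener) {
    static const char response[] =
        "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\nlol";
    char request[1024];

    int connection = accept(listener, NULL, NULL);
    if (connection < 0) return -1;

    ssize_t received = recv(connection, request, sizeof(request) - 1, 0);
    if (received < 0) received = 0;
    request[received] = '\0';
    printf("%s", request);

    send(connection, response, sizeof(response) - 1, 0);
    close(connection);
    return 0;
}
```

A `main` would call `make_listener` once and then `serve_one` in an endless loop. Binding to port 80 normally needs elevated privileges on Unix, so a high port such as 8080 is the easier choice for experiments.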


Now if only someone could write "Let's build a database"...


Yes! Would love to read a post about writing a toy DB :)


MongoDB? :)


I said a toy DB, not hype machine ;)


Nice, demonstrating with an invalid HTTP/1.1 request. All HTTP/1.1 requests are required to include a Host: header. If he'd wanted to keep it simple, he should have used 1.0.

GET / HTTP/1.0

is perfectly valid.

This is just an advert for his book anyway.
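The Host rule mentioned above is easy to check in code. The following is a naive sketch, not a real HTTP parser: `host_rule_satisfied` is a hypothetical helper that assumes CRLF line endings and only recognizes the exact spelling `Host:` (header names are actually case-insensitive).

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the raw request satisfies the rule "HTTP/1.1 requests must
   carry a Host header", 0 otherwise. Requests that are not HTTP/1.1 pass
   automatically; a request without a complete request line fails. */
int host_rule_satisfied(const char *request) {
    const char *line_end = strstr(request, "\r\n");
    if (line_end == NULL) return 0;               /* no complete request line */

    /* Does the request line end in "HTTP/1.1"? */
    const char version[] = "HTTP/1.1";
    size_t version_len = sizeof(version) - 1;
    size_t line_len = (size_t)(line_end - request);
    int is_http11 = line_len >= version_len &&
        strncmp(line_end - version_len, version, version_len) == 0;
    if (!is_http11) return 1;                     /* rule only applies to 1.1 */

    /* A header line starts right after a CRLF; ignore anything that
       appears after the blank line, since that is the body. */
    const char *headers_end = strstr(line_end, "\r\n\r\n");
    const char *host = strstr(line_end, "\r\nHost:");
    if (host == NULL) return 0;
    return headers_end == NULL || host < headers_end;
}
```

With this check, `"GET / HTTP/1.0\r\n\r\n"` passes, while `"GET / HTTP/1.1\r\n\r\n"` fails until a `Host:` line is added.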


1. As others have suggested, read W. Richard Stevens' book (http://smile.amazon.com/Unix-Network-Programming-Sockets-Net...). 2. Read the HTTP spec. 3. Write a simple server, send requests through some commonly encountered firewalls, proxy servers, etc. while running Wireshark on the client and server sides and see what happens to your request and response data. One popular antivirus program used to (and maybe still does) rewrite HTTP headers.


I am building one, too. Compiled the application in C, and running it through xinetd. Works great. Fork-exec of C executables is faster than VM-interpreter based ones.
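For readers unfamiliar with the xinetd approach: the super-server accepts the connection and starts the program with the socket attached to standard input and output, so the program itself needs no socket code. A minimal sketch of such a handler follows; `handle_request` and its fixed response are made up for illustration and are not the commenter's actual code.

```c
#include <stdio.h>
#include <string.h>

/* Reads request lines from `in` up to the blank line that ends the
   headers, then writes a fixed response to `out`. Returns the number of
   lines read before the blank line (request line included). Lines longer
   than the buffer are counted more than once; good enough for a sketch. */
int handle_request(FILE *in, FILE *out) {
    char line[1024];
    int lines = 0;

    for (;;) {
        if (fgets(line, sizeof(line), in) == NULL) break;   /* EOF or error */
        if (strcmp(line, "\r\n") == 0 || strcmp(line, "\n") == 0) break;
        lines++;
    }

    fputs("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\nlol", out);
    fflush(out);
    return lines;
}
```

Under xinetd the whole program would be little more than a `main` that calls `handle_request(stdin, stdout)` and returns.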


Author is a great writer. I wish I could read everything by Ruslan Spivak.


You can find more detail and technical info in Stevens' book "Advanced Programming in the Unix Environment". It has been the best book on the subject for maybe 25 years, and it helped me do the same thing about 10 years ago.


Buying that book has been one of the best decisions I have ever made, programming-wise. Stevens has a great writing style and manages to present the subject matter concisely without either talking down to his readers or being obscure.


I hardly consider using Python a help in "understanding of the underlying software systems". I'm tired of these pseudo-scientists who are seeking more "fame" than actually advancing the tech industry. YCombinator seems to get more of this pseudo-engineering upvoted. Time to go to slashdot.


If you have never done any network programming before, or even little programming experience at all, it is not a bad approach. Python is a nice language, and it has the advantages of being fairly readable and less error-prone than C.

Also, keep in mind that the author is working on a book on the subject, and this is most likely an excerpt, or at least a highly condensed version of what the book shows. Hopefully, the book will be a lot more detailed. (Furthermore, if you have written a web server before or are a long time Apache contributor or something like that, you are not the target audience.)

That being said, if I had never done any network programming before, I would probably still be slightly confused at the end, because I would not know anything about the socket API. As I have done network programming before, I found myself getting kind of bored quickly. And the really interesting stuff you'd need to know for writing a real web server is not even touched in this part ("Part 1" kind of implies there will be at least one more part): how do you map URLs to replies? How do you send files or call code dynamically to generate responses (a simple CGI implementation might be interesting for study purposes)? How do you handle multiple concurrent requests?
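As a taste of the first question, mapping URLs to replies can start as a plain lookup table. This is a toy sketch under obvious assumptions: the `route` struct, the two example paths and `lookup_body` are all made up here, only GET is handled, and a real server would also need status codes, headers and percent-decoding.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical route table: request path -> response body. */
struct route {
    const char *path;
    const char *body;
};

static const struct route routes[] = {
    { "/",      "home" },
    { "/about", "about this toy server" },
};

/* Extracts the path from a request line such as "GET /about HTTP/1.0"
   and returns the matching body, or NULL if the line is malformed, the
   method is not GET, or no route matches (the caller would answer 404). */
const char *lookup_body(const char *request_line) {
    char method[8], path[256];

    /* Field widths keep sscanf from overflowing the buffers. */
    if (sscanf(request_line, "%7s %255s", method, path) != 2) return NULL;
    if (strcmp(method, "GET") != 0) return NULL;

    for (size_t i = 0; i < sizeof(routes) / sizeof(routes[0]); i++) {
        if (strcmp(routes[i].path, path) == 0) return routes[i].body;
    }
    return NULL;
}
```

Serving files or generating replies dynamically would replace the `body` string with a function pointer or a file path, which is roughly where a CGI-style design begins.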




