|   Tracing it! 
                        Recently we launched an investigation 
                          of spam email and I promised you that we would talk 
                          about tracking e-mails to their senders.  
                          Well, 
                          every email message consists of two parts, the body 
                          and the header. The header can be thought of as the 
                          envelope of the message, containing the address of the 
                          sender, the recipient, the subject and other information. 
                          The body contains the actual text and the attachments. 
                          Some header information usually displayed by your email 
                          programme includes:  
                         From: - The sender's name and email address.  
                          To: - The recipient's name and email address.  
                          Date: - The date when the message was sent.  
                          Subject: - The subject line.  
                         The actual delivery of emails does 
                          not depend on any of these headers; they are just a 
                          convenience. Usually, the ‘From’ line, for example, will 
                          be set to the sender's address. This lets you know who 
                          the message is from and lets you reply easily. Spammers 
                          want to make sure you cannot reply easily, and certainly 
                          don't want you to know who they are. So they insert 
                          false email addresses in the ‘From’ lines 
                          of their junk messages.  
                         So the ‘From’ line is 
                          useless if we want to determine the real source of an 
                          email. Fortunately, we need not rely on it. The headers 
                          of every email message also contain ‘Received’ 
                          lines. These are not usually displayed by email programs 
                          by default, but they can be very helpful in tracing 
                          spam.  
                         Just like a postal letter will go 
                          through a number of post offices on its way from sender 
                          to recipient, an email message is processed and forwarded 
                          by several mail servers.  
                         Imagine every post office putting 
                          a special stamp on each letter. The stamp would say 
                          exactly when the letter was received, where it came 
                          from and where it was forwarded to by the post office. 
                          If you got the letter, you could determine the exact 
                          path taken by the letter. This is exactly what happens 
                          with E-mail.  
                         As a mail server processes a message, 
                          it adds a special line, the ‘Received’ line, 
                          to the message's header. The ‘Received’ 
                          line contains, most interestingly:  
                         The server name and IP address of 
                          the machine the server received the message from, and 
                         The name of the mail server itself.  
                         The ‘Received’ line is 
                          always inserted at the top of the message headers. If 
                          we want to reconstruct an e-mail's journey from sender 
                          to recipient we also start at the topmost ‘Received’ 
                          line and work our way down until we have arrived at 
                          the last one, which is where the email originated. 
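
To make the top-to-bottom reading concrete, here is a minimal Python sketch that lists a message's ‘Received’ lines in the order they appear, topmost first. It uses the standard library's email package; the sample message, its host names and its addresses are made up for illustration and do not come from a real mail.

```python
from email import message_from_string

# Hypothetical sample message; names and addresses are invented for illustration.
RAW_MESSAGE = """\
Received: from relay.example.net (relay.example.net [192.0.2.10])
\tby mx.example.org with SMTP; 16 Nov 2003 19:50:37 -0000
Received: from sender.example.com (sender.example.com [198.51.100.7])
\tby relay.example.net with SMTP; 16 Nov 2003 19:50:30 -0000
From: someone@example.com
To: you@example.org
Subject: Hello

Body text.
"""


def received_lines(raw_message):
    """Return the 'Received' header values, topmost (most recent) first."""
    message = message_from_string(raw_message)
    # get_all keeps the order in which the headers appear in the message.
    values = message.get_all("Received", [])
    # Collapse folded (multi-line) headers into a single line each.
    return [" ".join(value.split()) for value in values]


for hop, line in enumerate(received_lines(RAW_MESSAGE), start=1):
    print(hop, line)
```

Hop 1 is the line added by the last server to handle the message (usually your own), and the highest-numbered hop is the one closest to the sender.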
                         Spammers know that we will apply exactly 
                          this procedure to uncover their whereabouts. So, to 
                          fool us, they may insert forged ‘Received’ 
                          lines that point to somebody else sending the message. 
                         
                         Since every mail server will always 
                          put its ‘Received’ line at the top, the 
                          spammer's forged headers can only be at the bottom of 
                          the ‘Received’ line chain. This is why we 
                          start our analysis at the top and don't just derive 
                          the point where an email originated from the first ‘Received’ 
                          line (at the bottom).  
                         The forged ‘Received’ 
                          lines inserted by spammers to fool us will look like 
                          all the other ‘Received’ lines (unless they 
                          make an obvious mistake, of course). Looking at a line 
                          by itself, you can't tell a forged ‘Received’ line from 
                          a genuine one. This is where one distinct feature of 
                          ‘Received’ lines comes into play. As we've 
                          noted above, every server will not only note who it 
                          is but also where it got the message from (in IP address 
                          form).  
                         We simply compare who a server claims 
                          to be with what the server one notch up in the chain 
                          says it really is. If the two don't match, the earlier 
                          ‘Received’ line has been forged. In this 
                          case, the origin of the email is whatever the server 
                          immediately after the forged ‘Received’ line says 
                          about who it got the message from.  
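
The comparison can be sketched in a few lines of Python. This is a simplified illustration, not a complete tool: the Hop class and find_origin function are hypothetical names, the addresses are made up, and it assumes every server identifies itself by IP address. Real servers usually give a host name, which you would first have to resolve before comparing.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    by: str       # who the server that added this line says it is
    from_ip: str  # IP address that server saw the message arrive from


def find_origin(hops):
    """Return the IP address the message can be traced back to.

    hops are ordered topmost first; hops[0] was added by your own,
    trusted mail server.
    """
    origin = hops[0].from_ip
    for newer, older in zip(hops, hops[1:]):
        if older.by != newer.from_ip:
            # The older line claims an identity the newer server did not
            # see: treat the older line (and everything below) as forged.
            return newer.from_ip
        origin = older.from_ip
    return origin


# A consistent chain: the lower line really was added by 192.0.2.10.
honest = [Hop("mx.example.org", "192.0.2.10"),
          Hop("192.0.2.10", "198.51.100.7")]

# A forged chain: the trusted server saw 203.0.113.5, not 192.0.2.10.
forged = [Hop("mx.example.org", "203.0.113.5"),
          Hop("192.0.2.10", "198.51.100.7")]

print(find_origin(honest))
print(find_origin(forged))
```

In the forged chain the function stops at the first mismatch and reports the address recorded by the server one notch up, which is the last piece of information we can believe.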
                          Now 
                          that we know how emails work in theory, let's see how 
                          analysing a junk email to identify its origin works 
                          in real life.  
                         I've just received an exemplary piece 
                          of spam that we can use for exercise. Here are the header 
                          lines:  
                         Received: from unknown (HELO 38.118.132.100) (62.105.106.207) by mail1.infinology.com with SMTP; 16 Nov 2003 19:50:37 -0000 
                          Received: from [235.16.47.37] by 38.118.132.100 id <5416176-86323>; Sun, 16 Nov 2003 13:38:22 -0600 
                          Message-ID: <o7-89089$t--2-370--h6b1@y07l72.olpvl> 
                          From: "Reinaldo Gilliam"<27knxeppzk@yahoo.com> 
                          Reply-To: "Reinaldo Gilliam" <27knxeppzk@yahoo.com> 
                          To: ladedu@ladedu.com 
                          Subject: Category A Get the meds u need lgvkalfnqnh bbk 
                          Date: Sun, 16 Nov 2003 13:38:22 GMT  
                          X-Mailer: Internet Mail Service (5.5.2650.21) 
                          MIME-Version: 1.0 
                          Content-Type: multipart/alternative; boundary="9B_9.._C_2EA.0DD_23" 
                          X-Priority: 3 
                          X-MSMail-Priority: Normal  
                         First, take a look at the forged 
                          ‘From’ line.  
                         The spammer wants to make it look 
                          as if the message was sent from a Yahoo! Mail account. 
                          Together with the ‘Reply-To’ line, this 
                          ‘From’ address is aimed at directing all 
                          bouncing messages and angry replies to a nonexistent 
                          Yahoo! Mail account.  
                         Next, the ‘Subject’ is 
                          a curious agglomeration of random characters. It is 
                          barely legible and obviously designed to fool spam filters 
                          (every message gets a slightly different set of random 
                          characters), but it is also quite skilfully crafted 
                          to get the message across in spite of this.  
                         Finally, the ‘Received’ 
                          lines. Let's begin with the oldest: Received: 
                          from [235.16.47.37] by 38.118.132.100 id <5416176-86323>; 
                          Sun, 16 Nov 2003 13:38:22 -0600.  
                         There are no host names in it, but 
                          two IP addresses: 38.118.132.100 claims to have received 
                          the message from 235.16.47.37. If this is correct, 235.16.47.37 
                          is where the email originated, and we'd find out which 
                          ISP this IP address belongs to, then send an abuse report 
                          to them.  
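
Before looking up an ISP, it can be worth checking whether an address could belong to an ordinary Internet host at all. The following is a rough sanity check using Python's standard ipaddress module; the function name is_plausible_sender is made up for this sketch, and the check is an extra step of my own rather than part of the procedure described here.

```python
import ipaddress


def is_plausible_sender(ip_text):
    """Return False for addresses that cannot belong to an ordinary public host."""
    ip = ipaddress.ip_address(ip_text)
    return not (ip.is_multicast or ip.is_private or ip.is_reserved
                or ip.is_loopback or ip.is_unspecified)


for address in ("235.16.47.37", "62.105.106.207"):
    print(address, is_plausible_sender(address))
```

The check rejects 235.16.47.37, because the whole range from 224.0.0.0 to 239.255.255.255 is set aside for multicast. An address that fails is a strong hint that the line naming it should not be taken at face value; an address that passes proves nothing by itself.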
                         Let's see if the next (and in this 
                          case last) server in the chain confirms the first ‘Received’ 
                          line's claims: Received: from unknown (HELO 
                          38.118.132.100) (62.105.106.207) by mail1.infinology.com 
                          with SMTP; 16 Nov 2003 19:50:37 -0000. Since mail1.infinology.com 
                          is the last server in the chain and indeed ‘my’ 
                          server, I know that I can trust it. It has received 
                          the message from an ‘unknown’ host that 
                          claimed to have the IP address 38.118.132.100 (using 
                          the SMTP HELO command). So far, this is in line with 
                          what the previous ‘Received’ line said. 
                         
                         Now let's see where my mail server 
                          did get the message from. To find out, we take a look 
                          at the IP address in brackets immediately before by 
                          mail1.infinology.com. This is the IP address the connection 
                          was established from, and it is not 38.118.132.100. 
                          No, 62.105.106.207 is where this piece of junk mail 
                          was sent from.  
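
The same reading of the topmost line can be done mechanically. Here is a small Python sketch that pulls the HELO claim and the connecting IP address out of that line with a regular expression. The pattern is written only for the layout my mail server uses in this example; other servers format their ‘Received’ lines differently, so treat it as an illustration rather than a general parser.

```python
import re

# The topmost 'Received' line from the example, without the header name.
TOP_RECEIVED = ("from unknown (HELO 38.118.132.100) (62.105.106.207) "
                "by mail1.infinology.com with SMTP; 16 Nov 2003 19:50:37 -0000")

# Matches only the "from NAME (HELO CLAIM) (IP) by SERVER" layout shown above.
PATTERN = re.compile(
    r"from\s+(?P<name>\S+)\s+"
    r"\(HELO\s+(?P<helo>[^)]+)\)\s+"
    r"\((?P<ip>\d{1,3}(?:\.\d{1,3}){3})\)\s+"
    r"by\s+(?P<by>\S+)"
)


def parse_received(value):
    """Return the named parts of a 'Received' value, or None if it does not match."""
    match = PATTERN.search(value)
    return match.groupdict() if match else None


fields = parse_received(TOP_RECEIVED)
print("claimed (HELO):", fields["helo"])
print("connected from:", fields["ip"])
print("claim matches connection:", fields["helo"] == fields["ip"])
```

The value in the "helo" field is only what the sending machine said about itself; the value in the "ip" field is what the receiving server observed, and that is the one to report.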
                         With this information, you can now 
                          identify the spammer's ISP and report the unsolicited 
                          email to them so they can kick the spammer off the net. 
                          
                         |