Open Source Intelligence (OSINT) is the practice of collecting information from published or publicly available sources for intelligence purposes. The term ‘Open Source’ within Open Source Intelligence refers to the public nature of the analyzed data; publicly available information includes blogs, forums, social media sites, traditional media (TV, radio, and publications), research papers, government records, and academic journals. The scope of this information is almost unlimited and covers people, companies, and organizations of every kind. Those who use Open Source Intelligence range from IT security professionals and state-sanctioned intelligence operatives with ethical intentions to malicious hackers with unethical ones.

Understanding The History of Open Source Intelligence

The history of Open Source Intelligence dates back to the earliest use of intelligence to support a government’s decisions and actions. However, it was not used in a systematic way until the United States established the Foreign Broadcast Monitoring Service (FBMS) in 1941, the year of the Japanese attack on Pearl Harbor. In 1947, it became the Foreign Broadcast Information Service (FBIS) under the newly established CIA. In 2005, following the 9/11 attacks and the passage of the Intelligence Reform and Terrorism Prevention Act, FBIS, together with other research elements, was transformed into the Director of National Intelligence's Open Source Center (OSC). Since its establishment, the OSINT effort has been responsible for filtering, transcribing, translating/interpreting, and archiving news items and information from many foreign media sources.

What Role Does Open Source Intelligence Play in Different Industries?

OSINT is essential for many fields, such as law enforcement, risk and fraud management, human resources, cybersecurity, and military operations.
It can be used to identify data breaches, uncover vulnerabilities, back up decision-making processes, aid customer due diligence, or help users stay updated. In business, OSINT can be used for penetration testing, breach detection, ethical hacking, and chatter monitoring. OSINT is also crucial for keeping tabs on vast amounts of information.

Information technology professionals using OSINT often focus on three essential tasks: discovering public-facing assets, discovering relevant information outside the organization, and collecting and grouping discovered information into an actionable form. By finding public-facing assets using OSINT, IT professionals can see what anyone can learn on or about a company's assets without resorting to unethical means such as hijacking. Using OSINT to discover relevant information outside an organization lets IT professionals look beyond tightly defined networks, increasing their scope of discovery. Using OSINT tools to collect and group this discovered information shapes it into more valuable and actionable intelligence.

Within fraud detection and prevention, OSINT can be used as manual review support for anti-fraud systems. For instance, if an anti-fraud system’s ruleset is insufficient to assess a case correctly, OSINT can be used as a backup assessment. OSINT analysts can also search carder forums or the dark web to see what information is trending and what professionals should prepare for.

What Techniques Are Used in Open Source Intelligence?

OSINT reconnaissance involves using publicly available resources to gather information on a person or organization. OSINT reconnaissance techniques fall into three categories: passive, semi-passive, and active. Passive reconnaissance often involves searching the web using applications such as search engines. This reconnaissance method is hard to detect since no direct engagement is involved, and only archived information is collected.
Semi-passive reconnaissance usually consists of searching the web to find data, but can also use software solutions to gather information non-intrusively. Active reconnaissance is when data is collected directly from the target, offering more accurate and timely information. This type of probing can be detected by the target.

The best reconnaissance technique depends on the organizational needs of a team. However, following a general process helps lay the foundations for effective intelligence gathering. The Open Web Application Security Project (OWASP) outlines a 5-step OSINT process. The process begins with source identification: determining where the information for the specific intelligence requirement can be found. Next comes harvesting, which collects relevant information from the identified source. Data processing then works through the harvested data and extracts meaningful insights. The analysis step combines the processed data from multiple sources. Reporting is the last step, creating a final report on the findings.

Using OSINT investigative skills, such as identifying visual clues in photos (e.g., terrain, architecture, shadows, street signs) and leveraging tools like Google Earth or reverse image search, investigators can geolocate images effectively and uncover critical insights.

What Types of Open Source Intelligence Tools Exist?

OSINT tools can be divided into three main categories. Discovery tools are used to search for any information that might be found on the web. Good discovery tools can be as simple as search engines. Scraping tools ensure only the required information is filtered through for extraction to a database. Scraping tools are helpful in hiding the presence of bulky data transfers and preventing irrelevant information from mixing with relevant information.
Aggregation tools help combine related information from scraping tools to display a clearer picture of what the data represents, all in a presentable format. For example, they can surface relations and connections between datasets.

There are many free and paid open source intelligence tools available for a variety of purposes, such as searching metadata and code, researching phone numbers, investigating identities, verifying email addresses, analyzing images, detecting wireless networks, and analyzing packets. However, some of these tools are limited by a paywall. Here is a list of the latest open-source intelligence tools that are free and can be used to their full potential:

Nmap Scraping Tool

Nmap (Network Mapper) is a free, open-source tool for vulnerability checking, port scanning, and network mapping. It allows you to scan your network and discover everything connected to it, along with a wide variety of information about what’s connected. At its heart lies port scanning, which is helpful for administrators. Nmap supports a large number of scanning techniques, such as UDP, TCP connect(), TCP SYN (half-open), and FTP proxy (bounce attack). It also offers scan types such as Reverse-ident, ICMP (ping sweep), FIN, ACK sweep, Xmas, SYN sweep, IP Protocol, and Null scan.

Nmap can also run limited or scheduled network port scans, which is helpful since massive port scans would likely trigger security alerts on the target. Users can control the depth of each scan, choosing light or limited scans that report only port status or more detailed scans that report the operating systems using these ports.
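To make the TCP connect() technique mentioned above more concrete, here is a minimal Python sketch that uses only the standard library's socket module. It is not how Nmap itself is implemented; the function name tcp_connect_scan, the timeout, and the port list are illustrative assumptions. A port is treated as open when a full TCP connection succeeds. Only scan hosts you own or are authorized to test.

```python
import socket


def tcp_connect_scan(host, ports, timeout=0.5):
    """Return the ports from `ports` that accept a full TCP connection.

    Hypothetical helper for illustration; not part of Nmap.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an errno value otherwise,
            # so closed or filtered ports do not raise an exception here.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Check a few well-known ports on the local machine only.
    print(tcp_connect_scan("127.0.0.1", [22, 80, 443]))
```

Because every probe completes the three-way handshake, this kind of scan is easy for the target to log, which is one reason Nmap also offers half-open and other scan types.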
Nmap can do operating system detection via TCP/IP fingerprinting, stealth scanning, dynamic delay and retransmission calculations, parallel scanning, detection of down hosts via parallel pings, decoy scanning, port filtering detection, direct (non-portmapper) RPC scanning, fragmentation scanning, and flexible target and port specification. These qualities make Nmap very versatile.

Previously, controlling these scans required training in console commands. However, with the Zenmap graphical interface, experienced admins can more easily use commands to help them identify a target. This makes Nmap a helpful tool for experts and professionals involved in penetration testing. However, the tool is still very technical and not recommended for novice users.

Use Scenario: A user wants to use Nmap to identify a host’s operating system because they are performing an inventory sweep of their network and want to identify any older assets. The user runs Nmap with the -A switch, which enables OS detection along with version detection, script scanning, and traceroute. For example, running:
$ nmap -A localhost
yields output reporting that the host is running Linux 3.7 - 3.9. Using Nmap, the user has identified that the host is running an outdated operating system.

Wireshark Scraping Tool

Wireshark, a packet analyzer, effectively lets users put their network traffic under a microscope, allowing them to zoom in on the root cause of a particular problem. Wireshark captures network traffic on local networks such as Ethernet, Bluetooth, Wireless (IEEE 802.11), and Token Ring (packet capture). It then breaks the captured packets down (filtering) before storing their data for purposes such as offline analysis (visualization). Wireshark has many uses within the industry, such as network analysis and network security.
For instance, network administrators may use Wireshark to troubleshoot network problems, while network security engineers may use Wireshark to examine security problems. Quality assurance engineers may use Wireshark to verify network applications, while developers may use it to debug protocol implementations.

Beyond these uses in the industry, Wireshark can also be used as a learning tool. Those new to information security can use Wireshark to understand network traffic analysis, how communication occurs when particular protocols are involved, and where it goes wrong when certain issues present themselves. Wireshark can also help novice users learn more about network protocol internals, such as those of TCP/IP. However, to use Wireshark properly, a user should first learn how a network operates, including the three-way TCP handshake and protocols such as TCP, UDP, DHCP, and ICMP.

Use Scenario: A user has an issue with their home network; their internet connection is very slow. Using Wireshark, the user drills down into a packet to identify the problem. They quickly discover in the Wireshark interface that their router thinks a common destination (YouTube) is unreachable. The issue is easy to find since Wireshark’s interface marks any problem packet in black. Realizing this, the user restarts the cable modem to fix the problem.

GHunt Discovery Tool

This OSINT tool allows users to analyze a target’s Google history based on factors such as a Gmail address. From a Gmail address, GHunt can extract the target’s name, Google ID, YouTube account, and active Google services. With the right data, GHunt can also discover a target’s phone make and model, firmware and installed software, public photos, and even the target’s physical location.
Within the industry, white hat hackers and penetration testers may use GHunt to test whether the email addresses they find are valid and whether they leak other information. It can also be used in threat hunting to identify and track threats, or to understand the extent of a user’s or business’s internet footprint. These qualities make GHunt a great threat intelligence collection and attack simulation tool.

Use Scenario: A user’s friend has been receiving strange emails from a “secret admirer.” These messages contain statements that make the friend feel uncomfortable. The user decides to find the identity of this “secret admirer,” but cannot find their name from the Gmail address alone. The user chooses to use GHunt to investigate the sender's Gmail account. By typing: $ python3 hunt.py
Honeypots are a hot topic in the security research community right now. It seems everyone is starting up their own honeypot system. Most of the papers deal with the potential gains a honeypot can give you, and the proper way to monitor a honeypot. Not very many of them deal with the honeypots themselves. Most honeypots are deployed as just an extra box someone has lying around. They slapped an OS on it, checksummed all the files, installed an IDS, and set about waiting for the hackers to arrive. Those kinds of honeypots ignore some of the most interesting parts of what a honeypot can do. Honeypots can be used to ensnare and beguile potential hackers, entice them to give you more research information, and actively defend your production network. We decided to write down some of what we think is cool and fun to do with honeypots. These techniques can be used to create an environment that keeps hackers interested in your honeypot, encourages them to upload new toys, and extracts the maximum amount of data from them.

Simulated Traffic

One of the reasons most people do not see hackers doing interesting things on their honeypots is that there is nothing interesting for the attacker to play with. If you are going to find out what the attacker is all about, you need to make your honeypot interesting. One of the easiest ways to do that is to create simulated traffic to and from the honeypot. Replaying interesting traffic on the network can prompt the attacker to investigate other portions of your honeypot. Simulated traffic replayed over the wire can include e-mails, passwords, hostnames, or other common traffic. You want juicy traffic to entice the attacker to further investigate your machine and/or network (honeynet). Simulated traffic can be used in conjunction with simulated targets.
With simulated targets, you can replay traffic that appears to come from those simulated hosts to lure the attacker into investigating them further. Such traffic could be POP3, Samba, FTP, or HTTP traffic coming from the simulated targets. Traffic from services known to have a bad security history will definitely prompt the attacker to investigate further. If you want to really see what the attacker is all about, simulate traffic that looks like someone trading MP3s, or traffic that looks like someone transferring business documents. If the attacker spends most of his time looking at the MP3 traffic, he is probably pretty harmless. If he spends his time looking at the documents, he is probably pretty dangerous. Simulated traffic can be used as a kind of referral service among honeypots. Drop some packets on the wire that contain usernames and passwords or contain hints that the really good stuff is stored at a different location. Different breeds of attackers will chase down different leads and attack your other honeypots.

Simulated Targets

Once an attacker has taken all the trouble to set up shop on your honeypot, he'll probably want to see what else there is to play with. If your honeypot is like most traditional honeypots, there's not much for an attacker to do once he gets in. What you really want is for the attacker to transfer down all the other toys in his arsenal so you can have a copy as well. Giving an attacker additional targets with various operating systems and services can help him decide to give you his toys. The targets can be real, but you'll get almost as much mileage if they're simulated. A good place to start is to hang a phantom private network off the back of the honeypot. Most corporate networks are divided into a private internal network and a public DMZ. It is poor security but common to find direct links from DMZ machines into the private network.
If the attacker takes over the box and finds such a link, he is probably going to want to explore it. You can create whatever environment you want for him to explore. It should probably include a number of different operating systems running different services. Hopefully, the attacker will spot a service he has an exploit for and try to take it over. When the attacker transfers down the exploit, you'll get a copy to add to your library. The more compelling you make the simulated back-end network, the more likely you'll get additional toys.

Switching to a vulnerable OS/Service

You have an exploit for a Wu-FTP server? I have one of those. Here you go. Keeping your production servers patched is a must, but keeping your honeypot patched just limits the amount of fun you can have with it. New exploits are generated for old vulnerabilities all the time. If you just ignore those exploits, you'll miss what's going on behind the scenes in rootkit development, distributed hacking tools, and anything else that requires attackers to actually get on your box. Nearly everyone who attempts an exploit has useful data to give. The trick is getting it from them. The best way to extract all the data is to let the exploit succeed and watch to see what they do. Even if they use an old exploit, they may use a new rootkit or start up an IRC session that will lead you to some zero-days. If someone has an exploit and takes time out of their busy day to send it at your network, the least you can do is oblige them with a root shell.

To build an OS/Service switch, you'll need a public box, a switching box, and a number of boxes with various vulnerable services loaded on them. To cut down on the amount of real hardware, the vulnerable boxes can be replaced with VMware instances. The switching box is an inline box with multiple interfaces. All new traffic is routed to the public box by default.
Whenever an exploit is attempted on the public box, the IDS on the switching box looks up the OS and revision the attack targets and switches the attacker over to the appropriate target. The operating system and services of the public box are what the attacker is going to see when he scans the box. You can use a few tricks to get people to try more exploits. One is to obfuscate the banners. Instead of having your web server identify itself as Apache, identify it as Foobar.com front end proxy for Apache 1.3.19 and IIS 5.0. Anything to get attackers to throw exploits at the honeypot. You're really interested in what he does after the exploit. After you're sure that you've extracted all useful data from a particular set of attackers, you can use utilities such as Hogwash or Snort-Inline to filter out that particular exploit or that particular rootkit. The attacker may respond by changing their rootkit or modifying their exploit in some way, but that in itself is interesting data.

The hard part about running such an open honeypot is the recovery time. After each break-in, you need to clean off the attacker and reset everything for the next one. The two most popular methods are using Ghost or VMware. If you opt for Ghost, you can simply ghost the drive before you put it up on the network and then restore the image as needed. With VMware, you can keep a copy of the hacked image in an archive and then restart with a clean VMware image. I've seen a few honeypots where the administrators used a file system mounted on a loopback interface. I believe they met with limited success. There are also some people experimenting with user-space Linux. It looks promising.

Traffic Mangling

Once you've got the wily hacker attacking your honeypot, the last thing you want to do is let him attack the rest of your network from the honeypot, or worse, attack someone else's network. A good line of defense in this instance is traffic mangling.
Traffic mangling requires an inline box running software like Hogwash. The inline box can replace parts of an exploit with a broken equivalent. An example of a common mangler is to replace all instances of /bin/sh coming from the honeypot with /bin/hs. The attacker's attempt to execv a shell on the remote box will fail. This particular mangler has provided me with hours of entertainment while I watched the attacker download his debugging tools, source code, and favorite traffic analyzers to try and find out why his exploits weren't working. A good policy is to set up manglers for all the exploits you can get your hands on and then some general rules such as replacing all sam._ with mas._ . It's impossible to stop all outgoing exploits with manglers, but they can give you peace of mind that the outside world is relatively protected from your compromised honeypot, along with hours of fun watching attackers' failed attempts to continue their attacks elsewhere. This implementation can be considered a form of data control, which every honeypot/net should employ. Data control is a defense mechanism to stop attackers from attacking other machines or networks on the Internet from your honeypot.

Connection and Byte Limiting

Connection limiting can be used for both ingress and egress traffic. Connection limiting, like traffic mangling, can provide you many hours of enjoyment watching intruders fail to understand why they can't have multiple outbound/inbound connections. If you allow only a certain number of outbound connections, or vice versa, this method can be fairly easy to fingerprint, hinting to the attacker that he is currently on a honeynet or a system with traffic control. You can limit inbound traffic to n connections per x time frame. This gives you control over your honeypot system when you need to rein in inbound recon and exploitation attempts. I have seen multiple compromises happen simultaneously.

Egress connection limiting is a must for most honeypots.
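The "n connections per x time frame" policy described above can be sketched in a few lines of Python. This is only a model of the decision logic, not the way Hogwash or any particular inline box implements it; the class name ConnectionLimiter and its parameters are hypothetical. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import deque


class ConnectionLimiter:
    """Allow at most `max_conns` new connections per `window` seconds.

    Hypothetical illustration of a sliding-window connection limit.
    """

    def __init__(self, max_conns, window):
        self.max_conns = max_conns
        self.window = window
        self._stamps = deque()  # times of recently allowed connections

    def allow(self, now):
        # Forget connections that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_conns:
            self._stamps.append(now)
            return True
        return False  # over the limit: drop or reject the connection


limiter = ConnectionLimiter(max_conns=3, window=60.0)
decisions = [limiter.allow(t) for t in (0, 1, 2, 3, 61)]
print(decisions)  # [True, True, True, False, True]
```

The fourth attempt is refused because three connections were already allowed inside the 60-second window; by second 61 the two oldest have expired, so a new connection is allowed again.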
There are a number of ways you can go about limiting egress connections. You can restrict the honeypot to n simultaneous outbound connections. This will stop a number of DDoS agents and port scanning tools, and it limits the damage an attacker can do by attempting to port-scan or even exploit external hosts. One of the things that will make the network folks and your ISP take an active interest in your honeypot is if you're infected with a DDoS agent. Most of the time the network admin has his pager set to go off when the external link hits 100% saturation. To make matters worse, this usually happens at around 3 o'clock in the morning. You can limit the number of bytes transferred per second inbound or outbound. This method would be employed to stop the DDoS situation discussed above. This could also help kill some exploit attempts (e.g., the FreeBSD telnetd exploit). Unlike connection limiting, byte limiting is somewhat harder to fingerprint. A somewhat more elegant approach is to set the TCP window size in each packet to a small number. Although any of these methods will help, you should probably have a general-purpose strategy to kill the honeypot if you see a DDoS agent running on it.

Bait-n-Switch

The most basic, but among the most useful, things a honeypot can be used for is to divert hackers from attacking your production network. This is commonly known as the bait-and-switch method. A Bait-n-Switch honeypot needs three machines: your real web server; a honeypot, which can be an exact mirror of your web server minus all the sensitive data; and a BNS (Bait-And-Switch) box. Both the honeypot and the production web server are plugged directly into the BNS box. The BNS box runs an Intrusion Detection System. When the IDS determines that someone is an attacker, it starts redirecting the attacker's packets to the honeypot instead of the production machine.
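The redirect decision made by the BNS box can be modeled with a very small Python sketch. This is a toy illustration only: it does no packet handling, the IDS detection logic is out of scope, and the class name BaitAndSwitchRouter and its methods are hypothetical rather than taken from the Bait N Switch code.

```python
class BaitAndSwitchRouter:
    """Toy model of the BNS box's routing decision (no real packets)."""

    def __init__(self, production, honeypot):
        self.production = production
        self.honeypot = honeypot
        self._switched = set()  # source addresses the IDS has flagged

    def flag(self, src):
        """Record that the IDS has decided `src` is an attacker."""
        self._switched.add(src)

    def backend_for(self, src):
        # Once flagged, a source stays on the honeypot for all later
        # traffic; everyone else keeps reaching the production server.
        return self.honeypot if src in self._switched else self.production


router = BaitAndSwitchRouter(production="prod-web", honeypot="honeypot-web")
print(router.backend_for("203.0.113.7"))  # prod-web
router.flag("203.0.113.7")
print(router.backend_for("203.0.113.7"))  # honeypot-web
```

The key property is that the switch is sticky per source: legitimate visitors are unaffected, while a flagged source never reaches the production machine again.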
On most networks having two machines with the same IP address is a bad thing, but that actually works in your favor with a BNS-style honeypot. If the honeypot has the same IP and MAC address as the production server, the attacker may not notice that he's been switched. If he doesn't notice, you get to see all the fun things he had planned for your production server. If he does, he no longer has access to the production server and will probably go away. One current implementation of this approach is The Bait N Switch Project from Violating Networks. This method has defensive and research capabilities rolled into one system. Research comes into play once the attacker is switched and is now targeting the compromised honeypot (assuming the attack was successful). You have successfully defended your production machine and now have further research information on the attacker.

Honeypots and The Law

Whenever the topic of honeypots comes up, invariably there is someone who wants to debate the legalities of it. We're not lawyers, but here are some things you should think about when those inevitable discussions do come up. Entrapment is not a crime. It only applies to law enforcement and is only used as a defense to keep from going to jail. A normal citizen can't entrap anyone even if he really wants to. Trials generally have a bunch of people just like you in a jury box. If you're just trying to protect your networks, they will understand that. The legal system is not quite that messed up. Most of the time, the lawyers only get involved when there's enough money to make it worth their time. The FBI and other law enforcement agencies generally function in the same manner. Unless you're prosecuting them, the chances of an attacker bringing any sort of legal action against you are zero. I've port scanned someone I don't know at least once a day for the last five years. I haven't seen the inside of a court room yet.
Conclusions

Honeypots can be a serious research endeavor, or something you can have fun with. Your fun will translate into interesting stuff for the attacker to play with. The attacker is much more likely to spend time with an interesting site than with a boring one. He probably already has all the credit card numbers and free porn he wants, but he may be willing to send you a few more exploits for the chance to read about the affair you're having. There's no rule that says the network topology has to be anything conventional when you're setting up your honeypot. Once someone logs in, you can present new hosts, traffic, and subnets that don't really exist. After all, they're only packets; you can craft requests and replies as well as a hacker. A honeypot is an illusion that you weave for the attacker. Your illusion can be as creative as you want it to be. A good illusion will get you zero-day exploits, rootkits, and loads of information on how attackers work. Above all, have fun with it.

Jason Larsen

Jason Larsen is the primary author of Hogwash. You can find his code in various projects including Snort, ATS, the GTK packet decoder, and a long list of others. He has been published in a number of online security journals and medical journals. He is currently the Network Security Architect for the Idaho National Engineering and Environmental Laboratories, a DOE nuclear research lab in central Idaho.

Alberto Gonzalez

Alberto Gonzalez is one of the leading contributors to the Bait N Switch Honeypot system. He also contributes to various other open-source projects including Hogwash and Bigeye. He is currently an Intrusion Analyst with EDS in Northern Virginia. He is also in the process of getting his GCIA certification from SANS.

Honeypots act as traps for cybercriminals and also enhance security strategies by revealing attack vectors and behaviors through decoys and analysis.
Brittany Day