Email spamEmail spam, also referred to as junk email, spam mail, or simply spam, is unsolicited messages sent in bulk by email (spamming). The name comes from a Monty Python sketch in which the name of the canned pork product Spam is ubiquitous, unavoidable, and repetitive. Email spam has steadily grown since the early 1990s, and by 2014 was estimated to account for around 90% of total email traffic. Since the expense of the spam is borne mostly by the recipient, it is effectively postage due advertising.
Anti-spam techniquesVarious anti-spam techniques are used to prevent email spam (unsolicited bulk email). No technique is a complete solution to the spam problem, and each has trade-offs between incorrectly rejecting legitimate email (false positives) as opposed to not rejecting all spam email (false negatives) – and the associated costs in time, effort, and cost of wrongfully obstructing good mail.
EmailElectronic mail (email or e-mail) is a method of transmitting and receiving messages using electronic devices. It was conceived in the late–20th century as the digital version of, or counterpart to, mail (hence e- + mail). Email is a ubiquitous and very widely used communication medium; in current use, an email address is often treated as a basic and necessary part of many processes in business, commerce, government, education, entertainment, and other spheres of daily life in most countries.
Email encryptionEmail encryption is encryption of email messages to protect the content from being read by entities other than the intended recipients. Email encryption may also include authentication. Email is prone to the disclosure of information. Most emails are encrypted during transmission, but they are stored in clear text, making them readable by third parties such as email providers. By default, popular email services such as Gmail and Outlook do not enable end-to-end encryption.
SpammingSpamming is the use of messaging systems to send multiple unsolicited messages (spam) to large numbers of recipients for the purpose of commercial advertising, for the purpose of non-commercial proselytizing, for any prohibited purpose (especially the fraudulent purpose of phishing), or simply repeatedly sending the same message to the same user.
MD5The MD5 message-digest algorithm is a widely used hash function producing a 128-bit hash value. MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash function MD4, and was specified in 1992 as RFC 1321. MD5 can be used as a checksum to verify data integrity against unintentional corruption. Historically it was widely used as a cryptographic hash function; however it has been found to suffer from extensive vulnerabilities.
Naive Bayes spam filteringNaive Bayes classifiers are a popular statistical technique of e-mail filtering. They typically use bag-of-words features to identify email spam, an approach commonly used in text classification. Naive Bayes classifiers work by correlating the use of tokens (typically words, or sometimes other things), with spam and non-spam e-mails and then using Bayes' theorem to calculate a probability that an email is or is not spam.
Genetically modified organismA genetically modified organism (GMO) is any organism whose genetic material has been altered using genetic engineering techniques. The exact definition of a genetically modified organism and what constitutes genetic engineering varies, with the most common being an organism altered in a way that "does not occur naturally by mating and/or natural recombination". A wide variety of organisms have been genetically modified (GM), from animals to plants and microorganisms.
Email spoofingEmail spoofing is the creation of email messages with a forged sender address. The term applies to email purporting to be from an address which is not actually the sender's; mail sent in reply to that address may bounce or be delivered to an unrelated party whose identity has been faked. Disposable email address or "masked" email is a different topic, providing a masked email address that is not the user's normal address, which is not disclosed (for example, so that it cannot be harvested), but forwards mail sent to it to the user's real address.
Cryptographic hash functionA cryptographic hash function (CHF) is a hash algorithm (a map of an arbitrary binary string to a binary string with a fixed size of bits) that has special properties desirable for a cryptographic application: the probability of a particular -bit output result (hash value) for a random input string ("message") is (as for any good hash), so the hash value can be used as a representative of the message; finding an input string that matches a given hash value (a pre-image) is unfeasible, assuming all input str