Introduction to IMAP
Introduction
This will be a detailed, though not exhaustive, quickstart into using IMAP.
Initially this was also going to highlight the Python library, imaplib, but the post became too long! Maybe next time.
The hope is that this’ll contain enough information about querying email servers that additional questions would most likely be redirected to the spec or subsequent specs linked to in the post. I’m also interested in plenty of examples so my 2am self doesn’t need to overwork itself coming up with queries.
That said, when the need arises to programmatically read email, one should always reach for IMAP, which is so easy to get started with; you only need a terminal with telnet and/or openssl. Since the telnet version will mean insecure communications, I’ll only show examples with openssl.
The basic command to remember is:
# For gmail: HOST=imap.gmail.com
# For outlook/hotmail: HOST=imap-mail.outlook.com
openssl s_client -connect ${HOST}:${PORT:-993} -quiet -crlf
# Alternatively (not as powerful because curl can't support
# multiple commands per invocation):
# https://curl.haxx.se/mail/archive-2013-12/0022.html
curl --url "imaps://<host>" --user <user> --request <command>
Outlook Caveat
Those with two-factor authentication on their accounts will need to create an app password to log in by following Microsoft’s Using app passwords with apps that don’t support two-step verification. Otherwise one will receive a very unhelpful “NO LOGIN failed” error.
Gmail Caveat
The best way to allow command line access to your Gmail inbox:
- Enable IMAP access
- Add an app password to use as the account login password.
Time for some learning
For any serious IMAP command line usage, I recommend using rlwrap, which will allow for command completion and history. With IMAP, I invoke rlwrap like the following, which will ensure that IMAP login commands (which contain your password) aren’t leaked in the history:
rlwrap -g LOGIN openssl s_client -connect <host>:<port> -quiet -crlf
The baptism by fire example will:
- Login with your email and password
- List all the folders in the inbox
- Select the “Inbox” folder
- Search for all the emails from [email protected]
- Pick out just the date when the email was received
Below is a chunk of IMAP commands and their outputs. It will be broken down subsequently.
* OK Outlook.com IMAP4rev1 server version 17.4.0.0 ready (BAY451-IMAP411)
1 LOGIN <email> <password>
* CAPABILITY IMAP4rev1 CHILDREN ID NAMESPACE UIDPLUS UNSELECT
1 OK <email> authenticated successfully
2 LSUB "" "*"
* LSUB (\HasNoChildren) "/" "Inbox"
* LSUB (\HasNoChildren \Trash) "/" "Deleted"
* LSUB (\HasNoChildren \Sent) "/" "Sent"
* LSUB (\HasNoChildren \Drafts) "/" "Drafts"
* LSUB (\HasNoChildren \Junk) "/" "Junk"
* LSUB (\HasNoChildren) "/" "nbsoftsolutions"
2 OK LSUB completed
3 SELECT "Inbox"
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft $Forwarded)
* 6951 EXISTS
* 0 RECENT
* OK [UIDVALIDITY 105499179] UIDs valid
* OK [UIDNEXT 106952] Predicted next UID
* OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft)] Limited
3 OK [READ-WRITE] SELECT completed.
4 SEARCH FROM [email protected]
* SEARCH 1789 1791
4 OK SEARCH Completed
5 FETCH 1789,1791 (BODY[HEADER.FIELDS (DATE)])
* 1789 FETCH (FLAGS (\Seen) BODY[HEADER.FIELDS ("DATE")] {41}
Date: Thu, 27 Jun 2013 08:29:16 -0700
)
* 1791 FETCH (FLAGS (\Seen) BODY[HEADER.FIELDS ("DATE")] {41}
Date: Thu, 27 Jun 2013 13:06:36 -0700
)
5 OK FETCH Completed
Restricting the example to just the commands I sent, it boils down to:
1 LOGIN <email> <password>
2 LSUB "" "*"
3 SELECT "Inbox"
4 SEARCH FROM [email protected]
5 FETCH 1789,1791 (BODY[HEADER.FIELDS (DATE)])
Notice the incrementing numbers? Yes, I manually typed those in, and no, it doesn’t matter what you type. Many people use ? or gibberish, but incrementing numbers is good practice. The point is, they’re used for identifying commands, so it’s probably a good idea to make them unique!
From the spec:
The client command begins an operation. Each client command is prefixed with an identifier (typically a short alphanumeric string, e.g., A0001, A0002, etc.) called a “tag”. A different tag is generated by the client for each command.
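If the commands are ever generated by a script instead of typed by hand, the tags can come from a counter. This is only a sketch: the helper name tag_generator and the A0001 format (borrowed from the spec's example) are my own choices, not something IMAP requires.

```python
import itertools


def tag_generator(prefix="A"):
    """Yield a fresh, unique tag for every command: A0001, A0002, ..."""
    for n in itertools.count(1):
        yield f"{prefix}{n:04d}"


tags = tag_generator()
commands = ['LSUB "" "*"', 'SELECT "Inbox"']

# Prefix every command with its own tag before sending it to the server.
lines = [f"{next(tags)} {command}" for command in commands]
print(lines)
```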
And the reason we use SSL is the LOGIN command:
LOGIN command uses a traditional user name and plaintext password pair and has no means of establishing privacy protection or integrity checking.
You may have noticed that some lines terminate with {n}. Here n is the number of bytes of data that follow that line. This is important because IMAP is a line-based protocol, and {n} is the way to transmit data that spans multiple lines. So the {41} means that the 41 bytes after the line break belong to the message.
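To make the {n} counting concrete, here is a small sketch that slices the announced bytes out of a raw response. The function name read_literal is hypothetical, and the raw bytes are the first FETCH response from above with the CRLF line endings written out explicitly.

```python
import re

# A literal is announced as {n} followed immediately by CRLF.
LITERAL_MARKER = re.compile(rb"\{(\d+)\}\r\n")


def read_literal(raw):
    """Return (literal_bytes, rest) for the first {n} literal in raw."""
    match = LITERAL_MARKER.search(raw)
    if match is None:
        raise ValueError("no literal marker found")
    size = int(match.group(1))
    start = match.end()
    return raw[start:start + size], raw[start + size:]


raw = (
    b'* 1789 FETCH (FLAGS (\\Seen) BODY[HEADER.FIELDS ("DATE")] {41}\r\n'
    b"Date: Thu, 27 Jun 2013 08:29:16 -0700\r\n\r\n"
    b")\r\n"
)
literal, rest = read_literal(raw)
print(len(literal), rest)
```

The 41 bytes are the 37 characters of the Date line plus the two CRLF pairs that end the header block, which is why the closing parenthesis shows up on a line of its own.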
By selecting the “Inbox” folder, I’m saying that I want to read and write (manipulate) the folder.
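For reference, commands 2 through 5 of the walkthrough can be replayed from Python with the standard library's imaplib. This is a rough sketch under assumptions: dates_from_sender is a name I made up, and conn is expected to be an already logged-in connection (for example the result of imaplib.IMAP4_SSL(host, 993) followed by conn.login(user, password)).

```python
def dates_from_sender(conn, sender):
    """Replay commands 2-5 of the walkthrough on a logged-in imaplib connection."""
    conn.lsub()                                    # 2 LSUB "" "*"
    conn.select("Inbox")                           # 3 SELECT "Inbox"
    typ, data = conn.search(None, "FROM", sender)  # 4 SEARCH FROM <sender>
    seq_nums = data[0].split()                     # e.g. [b"1789", b"1791"]
    if not seq_nums:
        return []
    message_set = b",".join(seq_nums).decode()
    # 5 FETCH <message set> (BODY[HEADER.FIELDS (DATE)])
    typ, parts = conn.fetch(message_set, "(BODY[HEADER.FIELDS (DATE)])")
    # imaplib hands back each literal as an (envelope, payload) tuple and the
    # closing ")" as plain bytes, so only the tuples carry header data.
    return [part[1].decode().strip() for part in parts if isinstance(part, tuple)]
```

imaplib generates the tags and reads the {n} literals on its own, so neither shows up in the code.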
Examine vs Select
Prefer EXAMINE over SELECT for read-only behavior:
The EXAMINE command is identical to SELECT and returns the same output; however, the selected mailbox is identified as read-only. No changes to the permanent state of the mailbox, including per-user state, are permitted; in particular, EXAMINE MUST NOT cause messages to lose the \Recent flag.
Why do we care about the \Recent flag? What is it even for?
Message is “recently” arrived in this mailbox. This session is the first session to have been notified about this message; if the session is read-write, subsequent sessions will not see \Recent set for this message. This flag can not be altered by the client.
If there are multiple clients connecting to the same inbox, and if any of these clients rely on the \Recent flag, these clients may not be notified of new messages if one uses SELECT over EXAMINE.
In reality, a client that relies on the \Recent flag being present is making too big of an assumption and should use other polling mechanisms such as searching for unread messages. I would still recommend using EXAMINE because it’s better to treat an object as immutable when the opportunity arises. A good example of this is the \Seen flag (a message without this will appear bold in your inbox, i.e., unread): when issuing a FETCH command, there are some parts of a message that, if retrieved, will implicitly mark the message as read. This can have disastrous effects if someone is expecting unread messages to truly be unread, and an EXAMINE command may avoid this problem. The spec does not require this exact immutability behavior, so check with the IMAP server before committing.
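In Python's imaplib, the same choice is a keyword argument: select() sends EXAMINE instead of SELECT when readonly=True. A minimal sketch, assuming conn is an already logged-in imaplib connection and open_read_only is just a name I picked:

```python
def open_read_only(conn, mailbox="INBOX"):
    """Open a mailbox with EXAMINE so the session cannot change its state."""
    # imaplib issues EXAMINE rather than SELECT when readonly=True.
    typ, data = conn.select(mailbox, readonly=True)
    if typ != "OK":
        raise RuntimeError(f"could not open {mailbox!r}: {data!r}")
    return int(data[0])  # the EXISTS count, i.e. number of messages
```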
Message Sequence Numbers vs UID
Each message in IMAP has two numbers: its message sequence number and its unique identifier. The unique identifier is pretty self-explanatory, with a few caveats following, and the message sequence number is the relative position from the oldest message in the folder. If messages are deleted, sequence numbers are reordered to fill any gaps. As can be imagined, this is a source of many mistakes: if you’re looping through a list of message sequence numbers in ascending order, deleting messages as you go, you’ll end up deleting the wrong messages. The imaplib documentation highlights this problem:
After an EXPUNGE command performs deletions the remaining messages are renumbered, so it is highly advisable to use UIDs instead, with the UID command.
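The renumbering trap is easy to reproduce without a server. Below is a toy simulation, not real IMAP traffic: the mailbox is a plain list of (uid, subject) pairs, a message's sequence number is its 1-based position, and both helper names are made up for the illustration.

```python
def expunge_by_sequence(mailbox, seq_nums):
    """Delete one message at a time by sequence number, renumbering as we go."""
    mailbox = list(mailbox)
    for seq in seq_nums:
        del mailbox[seq - 1]  # everything after this position shifts down by one
    return mailbox


def expunge_by_uid(mailbox, uids):
    """Delete by UID; a UID stays attached to its message."""
    doomed = set(uids)
    return [message for message in mailbox if message[0] not in doomed]


mailbox = [(101, "a"), (102, "b"), (103, "c"), (104, "d")]

# Intent: delete the two oldest messages, "a" and "b".
wrong = expunge_by_sequence(mailbox, [1, 2])
right = expunge_by_uid(mailbox, [101, 102])
print(wrong)
print(right)
```

Deleting sequence number 1 turns "b" into the new number 1, so the second deletion removes "c" instead.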
So how does one use the UID command? Surprisingly easy. Take whatever command you were going to execute and prefix it with UID. We’ll modify the example from earlier to use UIDs instead.
1 LOGIN <email> <password>
2 LSUB "" "*"
3 SELECT "Inbox"
4 UID SEARCH FROM [email protected]
5 UID FETCH 101789,101791 (BODY[HEADER.FIELDS (DATE)])
The one caveat with UIDs is that, while they’re not supposed to change, the spec allows for some wiggle room:
The unique identifier of a message MUST NOT change during the session, and SHOULD NOT change between sessions.
One can tell if the UIDs have changed by looking at the UIDVALIDITY response when examining an inbox. If the number has changed from the last time, then UIDs gathered previously may be worthless. However, I believe in practice this does not happen because too many applications would break. The spec strongly suggests that:
The combination of mailbox name, UIDVALIDITY, and UID must refer to a single immutable message on that server forever.
As a result I would keep this in the back of your mind when sharing UIDs across connections (either concurrent connections or sequential). RFC4549, Synchronization Operations for Disconnected IMAP4 Clients contains several good quotes about this situation.
if UIDVALIDITY value returned by the server differs, the client MUST
- remove any pending “actions” that refer to UIDs in that mailbox and consider them failed
And Dovecot, probably the open source IMAP server, states:
[UIDVALIDITY] shouldn’t normally change, because if it does it means that client has to download all the messages for the mailbox again.
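A client that stores UIDs between sessions can guard against this by recording the UIDVALIDITY next to them. The following is a sketch only; the cache layout (a dict per mailbox) and both function names are assumptions made for the example.

```python
import re

UIDVALIDITY = re.compile(r"\[UIDVALIDITY (\d+)\]")


def parse_uidvalidity(response_line):
    """Pull the number out of a line such as '* OK [UIDVALIDITY 105499179] UIDs valid'."""
    match = UIDVALIDITY.search(response_line)
    return int(match.group(1)) if match else None


def usable_cached_uids(cache, mailbox, current_validity):
    """Return the cached UID list, emptied first if UIDVALIDITY has changed."""
    entry = cache.get(mailbox)
    if entry is None or entry["uidvalidity"] != current_validity:
        # Old UIDs may now point at different messages, so start over.
        cache[mailbox] = {"uidvalidity": current_validity, "uids": []}
    return cache[mailbox]["uids"]


cache = {"Inbox": {"uidvalidity": 105499179, "uids": [101789, 101791]}}
validity = parse_uidvalidity("* OK [UIDVALIDITY 105499179] UIDs valid")
print(usable_cached_uids(cache, "Inbox", validity))
```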
Even single-threaded implementations may get into a sticky situation when moving a message, as only part of the operation may complete: moving is composed of a COPY + STORE (unless your IMAP server supports the MOVE command), and the connection may be dropped after the COPY completes but before the STORE finishes. The RFC writes that the UIDPLUS extension alleviates accidentally downloading the message twice.
The one advantage of message sequence numbers over UIDs is that math can be done with the sequence numbers (e.g., messages 1:10 means there are a total of 10 messages in the set). It seems like a small advantage, but some people like it.
Search Examples
Find all messages in an inbox
? SEARCH ALL
Find messages with a flag set
? SEARCH ANSWERED
? SEARCH DELETED
? SEARCH DRAFT
? SEARCH FLAGGED
? SEARCH SEEN
? SEARCH RECENT
Date searching. The first three examples use the RFC-2822 Date header while the last three use the internal date. A message’s internal date “reflects when the message was received” whereas the Date header is for “specifying the date and time at which the creator of the message indicated that the message was complete and ready to enter the mail delivery system”. Testing has shown that querying on the internal date (the last three examples) is two orders of magnitude faster, and the message date and the internal date should be close if not equivalent.
The intervals specified are inclusive, so SINCE 12-Mar-2016 includes the messages received on March 12th.
? SEARCH SENTBEFORE 12-Mar-2016
? SEARCH SENTON 12-Mar-2016
? SEARCH SENTSINCE 12-Mar-2016
? SEARCH SINCE 12-Mar-2016
? SEARCH ON 12-Mar-2016
? SEARCH BEFORE 12-Mar-2016
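When the date comes from a program, it has to be rendered in this day-month-year form. A small sketch, with the helper names imap_date and since_days_ago invented for the example; the month names are spelled out by hand because strftime("%b") follows the machine's locale, while IMAP expects the English abbreviations.

```python
import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day):
    """Format a date the way SEARCH expects, e.g. 12-Mar-2016."""
    return f"{day.day}-{MONTHS[day.month - 1]}-{day.year:04d}"


def since_days_ago(today, days):
    """Build a SINCE criterion that reaches back `days` days from `today`."""
    return f"SINCE {imap_date(today - datetime.timedelta(days=days))}"


print(imap_date(datetime.date(2016, 3, 12)))
print(since_days_ago(datetime.date(2016, 3, 12), 7))
```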
It is also possible to use the WITHIN Search Extension, which defines two search keys, OLDER and YOUNGER. Each takes a number of seconds measured back from the server’s current time. The examples query messages that are younger or older than an hour.
? SEARCH YOUNGER 3600
? SEARCH OLDER 3600
Query on message properties. “A message matches the key if the string is a substring of the field. The matching is case-insensitive.”
? SEARCH TO [email protected]
? SEARCH FROM [email protected]
? SEARCH CC [email protected]
? SEARCH BCC [email protected]
? SEARCH BODY github
? SEARCH HEADER RECEIVED foo
Composing multiple search criteria. The only thing special is that the operators are written in Polish notation:
? SEARCH FROM [email protected] SINCE 12-Mar-2016
? SEARCH OR FROM [email protected] FROM [email protected]
? SEARCH OR (FROM [email protected]) (FROM [email protected])
? SEARCH OR OR FROM [email protected] FROM [email protected] FROM [email protected]
? SEARCH OR (FROM [email protected] SINCE 12-Mar-2016) FROM [email protected]
? SEARCH NOT (OR (FROM [email protected]) (BEFORE 12-Mar-2016))
? SEARCH NOT SEEN
? SEARCH UNSEEN
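Because OR takes exactly two keys, matching any one of several criteria means nesting ORs, as in the OR OR example. A sketch of building that prefix form from a list follows; any_of is a hypothetical helper and the example.com addresses are placeholders.

```python
from functools import reduce


def any_of(*criteria):
    """Fold criteria into nested two-argument ORs, written in prefix order."""
    if not criteria:
        raise ValueError("need at least one search criterion")
    # Parenthesize each criterion so multi-key ones (FROM x SINCE y) stay grouped.
    wrapped = [f"({criterion})" for criterion in criteria]
    return reduce(lambda left, right: f"OR {left} {right}", wrapped)


query = any_of(
    "FROM alice@example.com",
    "FROM bob@example.com SINCE 12-Mar-2016",
    "FROM carol@example.com",
)
print(f"? SEARCH {query}")
```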
And to retrieve message UIDs you can prefix the search command with UID:
? UID SEARCH SINCE 12-Mar-2016
? UID SEARCH OR FROM [email protected] FROM [email protected]
? UID SEARCH TO [email protected]
Searching can also be done on UIDs. Keep in mind the last example may be a good strategy for a mailbox listener to process all the UIDs after the last seen one, plus any unseen messages.
? UID SEARCH UID 1:*
? SEARCH UID 1:*
? UID SEARCH OR (UID 1:*) (UNSEEN)
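A sketch of that listener strategy, with the 1 replaced by the UID after the last one processed. The function names and the idea of keeping a single high-water mark are my assumptions, and the response line is made-up sample data.

```python
def new_mail_query(last_seen_uid):
    """Ask for every UID after the last processed one, plus anything unread."""
    return f"UID SEARCH OR (UID {last_seen_uid + 1}:*) (UNSEEN)"


def parse_search(line):
    """Turn an untagged '* SEARCH 1 2 3' line into a list of ints."""
    words = line.split()
    if words[:2] != ["*", "SEARCH"]:
        raise ValueError(f"not a SEARCH response: {line!r}")
    return [int(word) for word in words[2:]]


def advance(last_seen_uid, search_line):
    """Return the UIDs to process and the new high-water mark."""
    uids = parse_search(search_line)
    return uids, max([last_seen_uid, *uids])


print(new_mail_query(106951))
print(advance(106951, "* SEARCH 101789 106952 106953"))
```

One thing to keep in mind: per RFC 3501 a UID range ending in * always includes the last message in the mailbox, even when the starting UID is higher than any assigned UID, so the result can contain a UID that was already processed.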
ESEARCH Examples
If the server supports the ESEARCH extension, a few more possibilities open up:
Count the number of UNSEEN messages and return the first message/UID.
? SEARCH RETURN (MIN COUNT) UNSEEN
? UID SEARCH RETURN (MIN COUNT) UNSEEN
The ESEARCH extension can also condense message sets to cut down on transmission costs. It’s better to receive the 8 bytes of 1:300000 than the roughly 2MB it would take if the message ids were written individually.
? SEARCH RETURN () UNSEEN
? SEARCH RETURN (ALL) UNSEEN
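The condensed form is the same sequence-set syntax used elsewhere in IMAP: runs of consecutive ids collapse into start:end. A sketch of that collapsing, assuming the ids arrive sorted and without duplicates (condense is a name made up for the example):

```python
def condense(ids):
    """Collapse sorted, unique message ids into an IMAP sequence set."""
    ranges = []
    for n in ids:
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1][1] = n      # extend the current run
        else:
            ranges.append([n, n])  # start a new run
    return ",".join(
        str(start) if start == end else f"{start}:{end}"
        for start, end in ranges
    )


print(condense([1, 2, 3, 5, 7, 8, 9]))
print(condense(range(1, 300001)))
```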
SEARCHRES Examples
Remembering what messages were returned when doing a SEARCH can require unnecessary work: getting the message ids and then parsing them. RFC 5182, Referencing the Last SEARCH Result, allows the result set of a SEARCH to be saved and referenced in a subsequent command as $. The documentation for the extension already contains numerous examples, so I’ll copy and reduce them.
Find all the messages from github and then retrieve some metadata about those messages.
? SEARCH RETURN (SAVE) FROM [email protected]
? FETCH $ (UID INTERNALDATE FLAGS)
More cool ways to use $
? SEARCH (OR $ 1,3000:3021)
? MOVE $ "Other Messages"
To see how SEARCHRES interacts with ESEARCH, check out the RFC.
Fetch Examples
For the fetch examples, I’ll be using .PEEK where I can so that these examples won’t implicitly mark the message as being seen. In my opinion the only way a message should be marked as seen is if an explicit command sets the flag (but I didn’t write the spec, oh well!)
Fetch the contents of the email message:
? FETCH 1 BODY.PEEK[TEXT]
Fetch the header of the message:
? FETCH 1 BODY.PEEK[HEADER]
Fetch header and contents of email message
? FETCH 1 BODY.PEEK[]
Fetch specific parts of the header (the examples are complementary)
? FETCH 1 BODY.PEEK[HEADER.FIELDS (Date From)]
? FETCH 1 BODY.PEEK[HEADER.FIELDS.NOT (Date From)]
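The bytes that come back from a HEADER.FIELDS fetch are an ordinary RFC 2822 header block, so Python's email package can parse them. A sketch with sample bytes: the Date line is taken from the earlier walkthrough, while the From line is an invented placeholder.

```python
from email import message_from_bytes
from email.utils import parsedate_to_datetime

# What a server might send back for HEADER.FIELDS (Date From).
raw_header = (
    b"Date: Thu, 27 Jun 2013 08:29:16 -0700\r\n"
    b"From: Example Sender <sender@example.com>\r\n"
    b"\r\n"
)

message = message_from_bytes(raw_header)
sent_at = parsedate_to_datetime(message["Date"])  # timezone-aware datetime

print(message["From"])
print(sent_at.isoformat())
```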
Fetch metadata about the message
? FETCH 1 FLAGS
? FETCH 1 ENVELOPE
? FETCH 1 INTERNALDATE
? FETCH 1 RFC822.SIZE
? FETCH 1 BODYSTRUCTURE
? FETCH 1 UID
Fetches can be composed
? FETCH 1 (BODYSTRUCTURE UID)
? FETCH 1 (BODYSTRUCTURE UID RFC822.SIZE)
If you only want some of a field this is also possible through <start-index.length>
? FETCH 1 (BODYSTRUCTURE BODY.PEEK[]<0.200>)
Store Examples
Mark message as seen and deleted in addition to whatever flags may be present
? STORE 1 +FLAGS (\Deleted \Seen)
Unmark a message as seen and deleted, so it’ll show up in the inbox as unread.
? STORE 1 -FLAGS (\Deleted \Seen)
Completely replace the flags of a message with those provided
? STORE 1 FLAGS (\Deleted \Seen)
Alternatively, if a server response for STORE is not wanted, then one can specify FLAGS.SILENT in any of the previous examples.
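The three STORE forms map onto plain set operations. The toy model below only mimics what the server does to a message's flag set; apply_store is a hypothetical name and no IMAP traffic is involved.

```python
def apply_store(current, operation, flags):
    """Return the flag set a message ends up with after a STORE."""
    current, flags = set(current), set(flags)
    # .SILENT only suppresses the server's reply; the flag change is the same.
    operation = operation.upper().replace(".SILENT", "")
    if operation == "+FLAGS":
        return current | flags  # add to whatever is already set
    if operation == "-FLAGS":
        return current - flags  # remove only the listed flags
    if operation == "FLAGS":
        return flags            # replace everything
    raise ValueError(f"unknown STORE operation: {operation}")


before = {"\\Flagged"}
print(apply_store(before, "+FLAGS", ["\\Deleted", "\\Seen"]))
print(apply_store(before, "FLAGS", ["\\Deleted", "\\Seen"]))
```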
Concurrent Commands
The client MAY send another command without waiting for the completion result response of a command. […] Similarly, a server MAY begin processing another command before processing the current command to completion
However, if we try using our trusty s_client and pasting the following in, the commands will be executed sequentially after a brief delay (at least for Outlook).
3 SEARCH FROM [email protected]
4 FETCH 1500 (BODY[HEADER.FIELDS (DATE)])
5 SEARCH FLAGGED SINCE 1-Feb-1994 NOT FROM "Smith"
6 SEARCH HEADER X-FOO ""
7 SEARCH FROM [email protected]
8 SEARCH TEXT "string not in mailbox"
So it looks like the MAY is taken to heart and I would not recommend relying on other behavior. Instead, if two independent commands need to be sent, open another connection.
The IDLE and NOTIFY Commands
The IDLE command, as described by RFC 2177, is a simple way to have the server let the client know what’s going on without the client having to periodically poll the server.
Once we execute IDLE, the server will continuously stream simple responses about mailbox changes. The following example shows these simple responses.
* 0 RECENT
* 4 EXISTS
* 1 RECENT
* 5 EXISTS
* 2 RECENT
IDLE responses include more than notifications of new messages. For fun, try marking a message as unread and then read in a 3rd party client. You’ll see the IDLE stream include:
* 33558 FETCH (FLAGS ())
* 33558 FETCH (FLAGS (\Seen))
Here’s an example of me deleting a message in another client. We can see that the \Deleted flag is being set on the message, followed by an EXPUNGE:
* 4 FETCH (FLAGS (\Seen \Recent))
* 4 FETCH (FLAGS (\Deleted \Seen \Recent))
* 5 FETCH (FLAGS (\Seen \Recent))
* 5 EXPUNGE
Using IDLE can be a good way to debug faulty or misbehaving IMAP clients as one can easily see the actions taken against a mailbox.
Since the server can disconnect the client after 30 minutes, the spec recommends re-issuing the IDLE command every 29 minutes.
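A program reading the IDLE stream has to turn those untagged lines into events. The sketch below handles only the response shapes shown above (EXISTS, RECENT, EXPUNGE and FETCH with FLAGS); the function name and the dictionary layout are assumptions, and a real server may send other responses that this pattern ignores.

```python
import re

UNTAGGED = re.compile(
    r"^\* (\d+) (EXISTS|RECENT|EXPUNGE|FETCH)(?: \(FLAGS \((.*)\)\))?$"
)


def parse_idle_line(line):
    """Turn one untagged IDLE response into a small event dict, or None."""
    match = UNTAGGED.match(line.strip())
    if match is None:
        return None
    number, kind, flags = match.groups()
    event = {"number": int(number), "kind": kind}
    if kind == "FETCH":
        event["flags"] = flags.split() if flags else []
    return event


stream = [
    "* 5 EXISTS",
    "* 33558 FETCH (FLAGS ())",
    "* 4 FETCH (FLAGS (\\Deleted \\Seen \\Recent))",
    "* 5 EXPUNGE",
]
events = [parse_idle_line(line) for line in stream]
print(events)
```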
IDLE’s newer and more powerful brother is the NOTIFY command (RFC 5465). Since IDLE predates NOTIFY by over a decade (1997 vs 2009), don’t expect most mail servers to support NOTIFY. If your mail server does support NOTIFY, then make sure to use it!
Some of the benefits that NOTIFY provides:
- Watch more than one mailbox with a single connection
- Able to pick and choose what mailbox operations the connection receives
- Able to customize what is returned on said mailbox operations
So if your email represents a message queue, NOTIFY could be the command for you! No more polling and no more secondary FETCH commands to retrieve data you need.
NOTIFY is much more complicated than IDLE, probably needlessly so. There are hardly any examples, so I’ll do my best to add to them. I’ve found that support for NOTIFY varies between servers, with some NOTIFY servers rejecting examples from the spec with less than helpful error messages (Invalid Arguments).
To start simple, we’ll watch the mailbox we have selected for new messages and for expunged messages. When a new message arrives we’ll also fetch its UID and some header fields.
? NOTIFY SET (SELECTED (MessageNew (UID BODY.PEEK[HEADER.FIELDS (FROM DATE)]) MessageExpunge))
Quick tip: if you want to subscribe to new messages, you’ll also have to subscribe to expunged messages:
If one of MessageNew or MessageExpunge is specified, then both events MUST be specified. Otherwise, the server MUST respond with the tagged BAD response.
To turn off notification:
? NOTIFY NONE
Optionally one can provide a STATUS tag at the beginning of the command as shown below. Not really sure what it enables.
? NOTIFY SET STATUS (SELECTED (MessageNew MessageExpunge))
The server may be finicky with notifications and may give you the NOTIFICATIONOVERFLOW response when:
the server is unable or unwilling to deliver as many notifications as it is being asked to.
I notice this most frequently when specifying notifications for more than one mailbox. If you’re not working with this restriction, other usages are highlighted below:
? NOTIFY SET (SELECTED (MessageNew MessageExpunge)) (mailboxes postmaster (MessageNew MessageExpunge))
? NOTIFY SET (personal (SubscriptionChange)) (mailboxes postmaster (MessageNew MessageExpunge))
? NOTIFY SET (INBOXES (SubscriptionChange))
Section 6 of the RFC gives a better overview of the different mailboxes that can be selected.
On an interesting note, one can send commands while a NOTIFY is in progress and also switch to another mailbox. This has the effect of modifying what SELECTED mailbox the NOTIFY command is referring to.
Conclusion
This was a quickstart to IMAP and some of the more important extensions (I’m biased). I didn’t cover many things. There are still more commands in the IMAP spec to gloss over and many more extensions. And what I covered was the IMAP happy path. Numerous servers don’t support the features I showed or will respond with NO or BAD, which a good IMAP client should deal with.
Happy IMAPing
Comments
If you'd like to leave a comment, please email [email protected]
While I’ve never done admin impersonation with IMAP, I think it may be possible using the AUTHENTICATE command with PLAIN authentication.
The authentication string will be a base64 encoding of user + admin + admin password, similarly to what’s shown in this stackoverflow answer.
Let me know if it works!
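For anyone wanting to try it, here is a sketch of building that string. The SASL PLAIN mechanism (RFC 4616) joins the identity to act as, the identity to authenticate as, and the password with NUL bytes before base64 encoding. The function name, addresses and password are placeholders, and whether a server honors the impersonation at all is up to that server.

```python
import base64


def sasl_plain(act_as, login, password):
    """Build the base64 argument for AUTHENTICATE PLAIN.

    act_as: the mailbox owner to impersonate (authorization identity)
    login, password: the admin credentials (authentication identity)
    """
    raw = "\0".join((act_as, login, password)).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


token = sasl_plain("user@example.com", "admin@example.com", "admin-password")
print(f"1 AUTHENTICATE PLAIN\n{token}")
```

The token is sent on its own line after the server answers the AUTHENTICATE command with a + continuation.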
We tried this query:
TO "[email protected]" NOT SUBJECT "don't-look-here" SUBJECT "do-look-here"
And we get this error:
Unknown search criterion: NOT (errflg=2) in Unknown on line 0
Any ideas?
Lorena, are you potentially using PHP to send your queries? Searching for the error “Unknown search criterion” only shows PHP hits. Perhaps try removing PHP from the equation and see if the issue persists? Following this post’s example of using openssl (+ rlwrap) to test IMAP commands, I was able to verify that your query is syntactically fine.
I’m trying to understand some quirks about the imap4rev1 and implementations by yahoo, gmail, etc…
In a particular imap server (say gmail), I login then I select a mailbox, and it will return the usual info starting with:
* FLAGS (\Answered \Flagged \Draft \Deleted \Seen $NotPhishing $Phishing -flags a-funny-flag another-flaguy)
* OK [PERMANENTFLAGS (\Answered \Flagged \Draft \Deleted \Seen $NotPhishing $Phishing -flags a-funny-flag another-flaguy \*)] Flags permitted.
(...)
According to rfc3501 you have Flags and Keywords, and the servers that support those, like gmail, will distinguish flags (prefixed with a backslash) from keywords (everything else), and let us know that we can have “custom” flags by showing \* (backslash asterisk).
To add a keyword it’s as simple as adding a new flag to any message in that mailbox like:
s STORE nnn +FLAGS yyy ← where nnn is the seqnum and yyy is your desired flag
You can then remove the flag from the message, but… the custom flag (aka keyword) will remain even across sessions.
My question is, and I’ve searched without success, how does one remove keywords?
Hey Bruno,
Keywords and flags are removed the same way. I confirmed this behavior with gmail, other IMAP providers may differ:
1 STORE 1661 +FLAGS HELLO
* 1661 FETCH (FLAGS (\Flagged HELLO))
1 OK Success
2 FETCH 1661 FLAGS
* 1661 FETCH (FLAGS (HELLO \Flagged))
2 OK Success
3 STORE 1661 -FLAGS HELLO
* 1661 FETCH (FLAGS (\Flagged))
3 OK Success
4 FETCH 1661 FLAGS
* 1661 FETCH (FLAGS (\Flagged))
4 OK Success
In the example above I add the HELLO keyword, confirm it exists, remove it, and confirm it has been removed.
Hi Nick, thanks so much for your answer.
So, that much I knew about the +FLAGS and -FLAGS commands, however if you look closely, even when you remove all KEYWORDS (the custom flags) they remain in the PERMANENTFLAGS as in… they are still available in case you need them again.
I found a discussion from 100 years ago here → https://www.dovecot.org/list/dovecot/2010-March/047925.html ← in which it seems that PERMANENTFLAGS are in a sense PERMANENT; that means, even when no message has that custom flag assigned, the server will still keep that custom flag.
Go back to your example and check whether or not your email server still has a record of this custom flag HELLO
I don’t know why this bothers me, but unused flags ought to be purged if the user so wishes. And it seems that the RFC specs don’t require this.
Ah, thanks for the clarification. I can confirm that the keyword remains in the permanent flags even when no messages reference it (in gmail’s implementation)
Great investigation! I wasn’t aware of this behavior. Hopefully you aren’t seeing consequences from unused keywords.
I’ll set a reminder for myself and see if gmail periodically purges these.
I’ve been struggling with this issue for a couple of weeks now. I have a “on premise” mail server (Icewarp) - I want to migrate it to Office 365 using IMAP tools but I can’t find a way to logon using IMAP and grab the content of other user mailbox using the Admin account (which I gave it Full access to all mailbox). My goal is to migrate all accounts without having to reset everyone’s password.
I’ve tried to logon using openssl with the following syntax:
It really seems that I can’t find the correct syntax to use the admin account and access the other user mailbox.
Any suggestions you may think of?