Whilst conducting security testing and assurance activities, I went looking for logon events in Office 365. My first query was on IdentityLogonEvents. This led to a view of a multi-month attack by one or more threat actors against a tenant, followed by a trip down the rabbit hole of logs and computer systems. This blog summarises some of the methods and findings when considering threat hunting and authentication defences for Office 365. (Bear with me, I am tired, so this might need a bit of a tune-up later!)
Here is where it all started:
IdentityLogonEvents
| where TimeGenerated > ago(90d)
| where ActionType contains "failed"
| summarize count() by bin(TimeGenerated, 1h), AccountName, FailureReason
| sort by count_
| render columnchart
We enriched the IP data with IPINFO:
We also looked at the dataset in GreyNoise:
As you can imagine, I wanted to make sure the services were safe and that no unauthorized access had occurred.
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType == "50053"
| project TimeGenerated, UserId, UserPrincipalName, UserType, ResultDescription, ResultType, Location, AppDisplayName, SourceSystem, IPAddress
| summarize count() by bin(TimeGenerated, 1h), UserId
| sort by count_
| render columnchart
Advanced Hunting/Sentinel Data Sources
- IdentityLogonEvents
- SigninLogs
You will also see activity in the Unified Audit Log (UAL):
Controls
- Smart Lockout (Prevent attacks using smart lockout – Azure Active Directory – Microsoft Entra | Microsoft Learn)
- Conditional Access (What is Conditional Access in Azure Active Directory? – Microsoft Entra | Microsoft Learn)
- Risky Sign-ins
- Multi-Factor Authentication
- Monitoring and Alerting
Authentication methods – Microsoft Azure
Example Queries
SigninLogs
This is the primary location to analyse.
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType == "50053"
| project TimeGenerated, UserId, UserPrincipalName, UserType, ResultDescription, ResultType, Location, AppDisplayName, SourceSystem, IPAddress
| summarize count() by ResultDescription, ResultType, Location
| sort by count_ desc
A key point here is that the ResultType 50053 and its description will appear slightly differently depending on which log you review.
SigninLogs will show Smart Lockout blocking the accounts: “Account is locked because user tried to sign in too many times with an incorrect ID or password.”
The wording is not particularly clear, but this is SMART LOCKOUT working. This WILL NOT cause a denial of service to legitimate users or existing sessions.
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType == "50053"
| project TimeGenerated, UserId, UserPrincipalName, UserType, ResultDescription, ResultType, Location, AppDisplayName, SourceSystem
| summarize count() by ResultDescription, ResultType, Location
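One way to sanity-check the "no denial of service" point is to look for accounts that show lockout events and successful sign-ins on the same day. The following is a minimal sketch, not a definitive detection. It assumes the standard SigninLogs schema, where a ResultType of "0" indicates a successful sign-in; the column names LockedOut and Succeeded are illustrative only.

```kql
// Sketch: per account and day, compare Smart Lockout events (50053)
// with successful sign-ins ("0" is assumed to mean success).
SigninLogs
| where TimeGenerated > ago(7d)
| summarize LockedOut = countif(ResultType == "50053"),
            Succeeded = countif(ResultType == "0")
    by UserPrincipalName, bin(TimeGenerated, 1d)
| where LockedOut > 0   // only days where the account was being hit
| sort by TimeGenerated desc
```

If targeted accounts still show successful sign-ins on days with many lockout events, that is consistent with legitimate users continuing to work while the attack traffic is blocked. It does not by itself prove those successes were legitimate.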
Identity Logon Events
IdentityLogonEvents
| where TimeGenerated > ago(90d)
| where ActionType == "LogonFailed"
| where LogonType == "OAuth2:Token"
| summarize count() by bin(TimeGenerated, 1d), AccountName
| render columnchart

IdentityLogonEvents
| where TimeGenerated > ago(90d)
| where ActionType != "LogonFailed"
| where LogonType == "OAuth2:Token"
| summarize count() by bin(TimeGenerated, 1d), AccountName
| render columnchart

IdentityLogonEvents
| where TimeGenerated > ago(90d)
| where ActionType != "LogonFailed"
| where LogonType == "OAuth2:Token"
| summarize count() by bin(TimeGenerated, 1d), Location
| render columnchart

IdentityLogonEvents
| summarize count() by ActionType, Application, LogonType
| sort by count_ desc

IdentityLogonEvents
| where TimeGenerated > ago(90d)
| summarize count() by ActionType, Application, LogonType, AccountName, AccountUpn
| sort by count_ desc

IdentityLogonEvents
| where TimeGenerated > ago(2h)
| sort by TimeGenerated desc
| where ActionType != "LogonSuccess"
//| project TimeGenerated, ActionType, Application, LogonType, AccountName, AccountDomain, Location, ISP, IPAddress, FailureReason
//| project TimeGenerated, ActionType, Application, LogonType, Location, ISP, FailureReason
| sort by TimeGenerated desc
//| summarize count() by bin(TimeGenerated, 1d), Location
//| render columnchart
Testing Tools
0xZDH/Omnispray: Modular Enumeration and Password Spraying Framework (github.com)
References
Smart Lockout & Logging
Heads up: smart lockout is on by default, but you need an Azure AD Premium P1 license to customise its settings:
OK, I’m not going to dissect every packet, but when we authenticate to Office 365 from the internet this is a probable pattern:
- Our DNS client requests resolution.
- An anycast IP is returned.
- Our client attempts to connect, and the nearest regional datacentre point responds (global and fast).
- We then attempt to authenticate
- This is where the SMART LOCKOUT process comes into effect.
Threat actors often use botnets, rotating proxy services and/or open redirects. A threat actor with the motivation, means and access can therefore easily send a large volume of requests from thousands, or hundreds of thousands, of IPs. This is where smart lockout is going to help us, but it will obviously look “very interesting” from a logging point of view (if you go looking).
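To get a feel for how distributed the source traffic is, the lockout events can be summarised per targeted account. This is a rough sketch that assumes the standard SigninLogs columns used elsewhere in this post; Attempts, DistinctIPs and Countries are illustrative names, and dcount() returns an estimate rather than an exact count.

```kql
// Sketch: how many distinct source IPs and locations hit each account?
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType == "50053"
| summarize Attempts = count(),
            DistinctIPs = dcount(IPAddress),
            Countries = dcount(Location)
    by UserPrincipalName
| sort by Attempts desc
```

A high DistinctIPs value against a small number of accounts is what you would expect to see from a botnet or rotating proxy service.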
It’s useful to know how this logging works, because it is useful threat intelligence. Just because a threat actor’s attack hasn’t worked:
- It may do in the future
- You may be being targeted
- You may want to assure controls and validate configuration for targeted (and other) identities in your environment
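As one example of assuring the targeted identities, you could check whether any IP address that triggered a lockout also produced a successful sign-in. This is a sketch under assumptions, not a complete compromise assessment: it assumes ResultType "0" means success, and attackIPs is a hypothetical name chosen for illustration.

```kql
// Sketch: successful sign-ins from IPs that also generated lockout events.
let attackIPs = SigninLogs
    | where TimeGenerated > ago(90d)
    | where ResultType == "50053"
    | distinct IPAddress;
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType == "0"            // assumed to indicate a successful sign-in
| where IPAddress in (attackIPs)
| project TimeGenerated, UserPrincipalName, IPAddress, Location, AppDisplayName
| sort by TimeGenerated desc
```

Any results need reviewing by hand. A legitimate user behind a shared or corporate IP can appear here if they were caught by a lockout themselves, so a hit is a lead to investigate rather than proof of compromise.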
Summary
Late at night, when I first saw 50k logon attempts, I realised this needed investigating. The interesting thing was that there were no alerts or incidents. So I had a few questions:
- Where is this coming from?
- Why are they not alerting?
- Are the accounts being locked out and is this causing Denial of Service (DoS)?
- Can we block these?
- Can we prove the identities only have legitimate access?
After running through a range of activity, learning, and testing, I can now answer these questions. I’ve put this rapid-publish post together to help other people.
- Where is this coming from? One or more unknown threat actors, possibly in the eastern part of the globe (based on the percentage of source traffic from China and India, plus time correlation). This is not confirmed.
- Why are they not alerting? The volume of events would create an unmanageable number of alerts, and the authentication attempts were not successful.
- Are the accounts being locked out and is this causing Denial of Service (DoS)? Partially. It’s stopping the attackers from signing in (even if they had valid credentials, which we do not believe they do).
- Can we block these? They are being blocked.
- Can we prove the identities only have legitimate access? Yes, we have checked and believe none of them are compromised.
So, great stuff. The only thing I would like, however, is for the log descriptions to be a bit clearer. It wasn’t immediately clear that SMART LOCKOUT was causing the locks, and then the question of DoS came into my mind. It took me quite some effort to get round to the “oh ok, that’s not causing DoS”.
We have layered controls here, but the fact remains that someone has been sending a large volume of authentication attempts to a small number of identities for several months. Without looking at these logs I would never have known that. Some might argue ignorance is bliss, but I’ve not really found that to be the case with cyber security.
Also CHATGPT – https://twitter.com/SU1PHR/status/1612820176528936962?s=20&t=fsqfLpruWTvOBsS4hRoUzQ
Maybe I need to hang up my blogging hat….