Learning how to hack AEM Web Apps
5 min read
I had to cancel my Diwali vacation plan for some reason. Now, I had 10 days off work not knowing what to do. So, I decided to jump on Hackerone and practice some hacking to make good use of this time. Luckily, I managed to find six bugs in the first five days itself. This post is about the first two of those bugs, that I found on a web app running Adobe Experience Manager.
I logged into Hackerone, found a good broad scope private program and opened the main website on my browser. Usually, the first thing I do when I visit a website is check what Wappalyzer can tell me about the application. (Wappalyzer is a browser extension that shows what technology a web application is using).
In this case, Wappalyzer detected the CMS the website was using – Adobe Experience Manager (AEM). I had heard of AEM misconfigurations before. I had a few links and tweets bookmarked in my browser for the same, but I never really got the time to check any of it out. With a good ten days in hand, this was the perfect chance to finally get started with AEM hacking.
Great, but how do I start?
Twitter. Yes, that’s a good start. I started searching for tweets with the search query and hashtag “AEM #bugbounty”. I found a bunch of tweets with a bunch of test cases that I could try. But not knowing anything about AEM, it didn’t make sense to jump to exploitation directly. I stumbled upon a tweet that talked about the research Mikhail Egorov had done on AEM over the last few years, and I found links to a couple of talks where he had presented his research. (Highly recommend you check these talks, here and here ).
Watching the talks certainly helped. I knew more about AEM’s architecture than I did before. To summarize, AEM has a 3-layer architecture. The three layers are: Author, Publisher and Dispatcher layer. Author Instance is where the content is posted, Publisher is the instance that the user interacts with using the Dispatcher.
Dispatcher is the only security mechanism that protects publisher by stopping requests that do not meet policy. And things start getting really bad if the Dispatcher policy is not set correctly. Dispatcher bypasses are quite common, and a dispatcher can eventually expose servlets that ideally should not be exposed. In other words, Dispatcher bypasses allow to talk to insecure components of the publisher instances.
This can lead to Information Disclosure, XSS, SSRF and a lot of other issues. For the scope of this post, we will talk about the two issues I came across: Information Disclosure through QueryBuilderJsonServlet and HTML Injection via MergeMetadataServlet.
The dispatcher policy blocks requests when you try to access the QueryBuilderJsonServlet at the endpoint /bin/querybuilder.json However, it can be bypassed if the rules aren’t strict enough by appending the following to the endpoints.
Slide from: speakerdeck.com/0ang3el/securing-aem-webapp..
In my case .;a.css worked and I was able to bypass the dispatcher to access the QueryBuilderJsonServlet.
Cool, dispatcher bypassed. But how is it a security issue? There seems to be no impact here, right? The response to this request has not really retrieved any sensitive information. I had the same question and so I reached out to another security researcher on Twitter who seemed to know his AEM stuff. The guy suggested I read up on how to build queries manually on QueryBuilder.
So, that’s what I tried next. I started reading on how to build queries manually in Adobe’s docs. Also watched a couple YouTube playlists until I figured out a thing or two. I checked AEMSecurity’s Twitter account and found a couple of queries there as well. Fast forward, I came up with the following:
The above sample queries expose user information, UUIDs, paths on server, files on server, server snapshots, packages, etc. I reported this immediately and it turned out to be a duplicate, someone had already reported it.
MergeMetaDataServlet HTML Injection
I also found the MergeMetadataServlet exposed on the same web app, so I tried to look into exploiting that as well. The bypass for this one was quite simple, just had to add the .css extension to the metadata endpoint.
As seen in the screenshot above, the data we input in the path parameter is reflected back on the webpage. Chance for a classic reflected XSS. I immediately started testing for reflected XSS, but got blocked by the WAF in place.
The WAF was blocking almost all HTML tags, =' " and other special characters, onXXX eventHandlers and a bunch of other stuff. After a series of failed attempts to achieve XSS, I ended up reporting it as an HTML injection. Apparently, h1,h2,li,a and a few other tags were not blocked by the WAF.
Note: The response of the MergeMetadataServlet is normally in JSON. In order to execute the HTML, we need to change the Content-Type of the response from application/json to text/html. The simplest way to do that is to append the endpoint with an HTML filename (check above screenshot). The /dMr.html in the URL ensures the server changes the Content-Type of the response to HTML.
There is a lot more to security misconfigurations on AEM. There is a possible SSRF attack via Opensocial (Shindig) proxy, ReportingServicesProxyServlet, SalesforceSecretServlet, AutoProvisioningServlet. There is a possible XSS via VideoPlayer.swf, SuggestionHandlerServlet, WCMDebugFilter and MergeMetaDataServlet. There are possible DoS attacks as well. If GroovyConsole is enabled on the web server, it could directly lead to RCE as well. I highly recommend going through Mikhail Egorov’s presentations and AEMSecurity’s Twitter account for more details on these:
- Talk 1
- Talk 2
- Automating Dispatcher checks: github.com/0ang3el/aem-hacker
Until next time :)
Did you find this article valuable?
Support Mohammed Arbaaz Shaikh by becoming a sponsor. Any amount is appreciated!