In this short post I'll tell the story of how we debugged an issue that prevented us from starting a Hadoop YARN ResourceManager.
We had configured it to enable SSL/TLS and to read its keystore from `/opt/hadoop/keystore.jks`, which was owned by `hdfs:hadoop` with mode 640.
We started the ResourceManager as user `yarn`, who belongs to the group `hadoop`.
Everything should work, right? `keystore.jks` belongs to group `hadoop`, and members of that group, which we are, should be able to read it.
Instead we got a permission denied exception like this:

```
java.io.FileNotFoundException: /opt/hadoop/keystore.jks (Permission denied)
```
Of course we turned to Google to figure out what went wrong, and everyone was telling us the same thing: you need the proper permissions on every part of the path, so we also needed read and execute permissions on `/opt` and `/opt/hadoop`. Unfortunately, that was already the case.
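That "check every component of the path" advice can be sketched as a tiny Python helper (a quick sketch of our own; `check_path_access` is not from YARN or Hadoop, and note that `os.access` checks against the process's *real* uid/gid):

```python
import os

def check_path_access(path):
    """Walk every component of an absolute path and report whether the
    current process may read (r) and traverse (x) it. For the directories
    along the way, the x bit is what matters for reaching the file."""
    parts = [p for p in path.split("/") if p]
    current = "/"
    for part in parts:
        current = os.path.join(current, part)
        r = os.access(current, os.R_OK)
        x = os.access(current, os.X_OK)
        print(f"{current}: read={r} exec={x}")

check_path_access("/opt/hadoop/keystore.jks")
```

On our hosts a check like this reported read and exec on every component, which is exactly why the error was so puzzling.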
This post details all the steps we took to debug the issue and finally provides the solution.
Let's start with a quiz, about something I didn't know either but learned while investigating this issue.
Running this code:
```shell
mkdir test
chmod 777 test
cd test
touch foobar
chmod 700 foobar
cat foobar
chmod 070 foobar
cat foobar
```
What result would you expect, assuming `foobar` belongs to your own user and group?
I expected it to show me the result two times.
What really happens, however, is a permission denied error on the second `cat`:

```
cat: foobar: Permission denied
```
We did turn to the POSIX spec, but that (IMO) is not super clear. Wikipedia, however, gave us the answer:
> The effective permissions are determined based on the first class the user falls within in the order of user, group then others. For example, the user who is the owner of the file will have the permissions given to the user class regardless of the permissions assigned to the group class or others class.
Ah! So it checks my user class first, and since that class does not grant access, it doesn't even consider the group or others classes.
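That rule can be written out as a few lines of Python (a sketch of my own, ignoring root's override; `may_read` and all the numbers are made up for illustration):

```python
import stat

def may_read(mode, file_uid, file_gid, uid, gid, groups):
    """Sketch of the POSIX class-selection rule: exactly ONE class
    applies -- owner if the uid matches, else group if any of our gids
    match, else other -- and only that class's read bit is checked."""
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid == gid or file_gid in groups:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

# The quiz: chmod 070 on a file we own. The owner class is selected
# and has no read bit, so access is denied despite the group bits.
print(may_read(0o070, 1000, 1000, 1000, 1000, []))  # False
```

The group bits never even enter the picture once the owner class has matched, which is exactly what the quiz demonstrates.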
Anyway, back to the issue at hand.
We checked everything we could think of:
First we checked SELinux using `sestatus`, but it was disabled.
Next, we tried logging in as `yarn` and reading the file from a bash shell. Surprisingly, that worked. Weird.
Then we checked (using `ps`) whether the ResourceManager process actually runs as the proper user. At this point we were down to comparing the numeric uid and gid rather than relying on the names themselves. But... those also matched.
To recap: a Java process running as user `yarn` gets a `Permission denied` reading a file, while a bash shell logged in as `yarn` can read said file.
As the next step we ripped out the exact YARN code that accesses the file, which boils down to something like this:
```java
FileInputStream stream = new FileInputStream(file);
int firstByte = stream.read();
stream.close();
```
Which is not very interesting, but we ran it anyway. It worked!
Next step: attaching a debugger to the running ResourceManager process. jdb to the rescue, since we didn't have the access needed to attach a remote debugger from our own machines.
We attached to the ResourceManager and ran the following statements:
- `new File("/opt").canRead()` returns `true`
- `new File("/opt/hadoop").canRead()` returns `false` (WAT?)
- `System.getProperty("user.name")` returns `yarn`
As a reminder, this is what our directory structure looks like:
```
/opt                      root:root    drwxr-xr-x
/opt/hadoop               hdfs:hadoop  drwxr-xr-x
/opt/hadoop/keystore.jks  hdfs:hadoop  -rw-r-----
```
Okay, this is confusing. What's going on?
At this point we dug through the OpenJDK source code and ended up in the Linux kernel sources. Java, in the end, uses an `open` syscall, which returned `-1`.
In order to replicate this I've written a short C program which narrows it down as much as possible:
```c
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char *argv[]) {
    printf("%d\n", access(argv[1], R_OK));
    printf("%s\n", strerror(errno));
    printf("%d\n", open(argv[1], O_RDONLY));
    printf("%s\n", strerror(errno));
}
```
Surprise: This also worked perfectly.
Back to the debugger. This time we created a simple shell script that did nothing but echo the results of `id -Gn` and `id yarn` to a file. We put that shell script in `/var/lib/hadoop-yarn/testscript.sh`, made it executable and owned by `yarn`.
Then we went back to `jdb` to execute this piece:

```
print Runtime.getRuntime().exec("/var/lib/hadoop-yarn/testscript.sh")
```
And this is how we finally got one step closer.
The result was:
- `id -Gn` is missing our `hadoop` group, which would have given us access to the file
- `id yarn` does include the group

While this was super confusing, it meant that we could finally stop digging at the low level: Java and Linux behaved as expected, so the issue had to be somewhere in how the process was set up.
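The asymmetry we saw amounts to comparing two views of group membership. A rough sketch of that comparison in Python (my own illustration, not code from YARN or Cloudera; the database enumeration here mirrors what we later found in supervisor):

```python
import grp, os, pwd

# Groups the *running process* actually carries:
proc_groups = set(os.getgroups())

# Groups the group-database enumeration says our user is a member of:
try:
    user = pwd.getpwuid(os.getuid()).pw_name
except KeyError:
    user = ""  # uid not in the passwd database at all
db_groups = {g.gr_gid for g in grp.getgrall() if user in g.gr_mem}

# If either set has entries the other lacks, the process was started
# with different supplementary groups than the database would suggest.
print("process-only groups:", sorted(proc_groups - db_groups))
print("db-only groups:", sorted(db_groups - proc_groups))
```

In our case the ResourceManager process was the one missing a group its user was supposed to have.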
All of this was in a Cloudera environment. Cloudera uses "Agents" implemented in Python to communicate with the Cloudera Manager Server. These Agents are the ones receiving commands to do things like start and stop processes. So we looked into those. I already knew that Cloudera uses Supervisor to actually manage and supervise the processes.
Looking into how supervisor starts the processes we found this:
```python
groups = [grprec[2] for grprec in grp.getgrall() if user in grprec[3]]
...
os.setgroups(groups)
```
Ha! Supervisor sets the supplementary groups for a process manually. Why, I don't know, but it does.
We used a Python console to check what `grp.getgrall()` returns and, drumroll, it is missing our `hadoop` group. Finally!
This made things easier, because the problem was now much simpler to reproduce.
So I looked in the CPython source code and found out how `getgrall()` is implemented. This is an extract:

```c
while ((p = getgrent()) != NULL) {
```
`getgrent` was something to google for, and this time it was enough to actually get a result and our solution. This Knowledge Base article from Red Hat states that `getgrent` does not return groups from LDAP when SSSD is used unless `enumerate` is turned on!
This was our problem! Our groups were defined in a central IPA instance and were not local, so they were not returned by `getgrent`. Our workaround is to create the groups locally instead of centrally, as we didn't want to take on the overhead of `enumerate`.
This was a fun day! I learned a lot.
If you have your groups in LDAP and retrieve them via SSSD, then Python's group enumeration won't see them, so supervisor won't see them, which leads to processes started by supervisor not getting them either. To work around this, create your groups locally or enable `enumerate` in SSSD.
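For completeness, enumeration is a per-domain switch in `/etc/sssd/sssd.conf` (a sketch; the domain name is a placeholder, and keep in mind Red Hat discourages enumeration on large directories for performance reasons):

```ini
[domain/example.com]
enumerate = true
```

SSSD needs a restart afterwards for the setting to take effect.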