The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Hmm, named failing at start on one of the secondaries: need to investigate this further

Posted by jpluimers on 2017/05/24

I was not too happy that this just happened after updating one of the DNS secondaries:

May 24 21:29:48 laurel systemd[1]: Starting LSB: Domain Name System (DNS) server, named...
-- Subject: Unit named.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit named.service has begun starting up.
May 24 21:29:49 laurel named[3173]: Starting name server BIND cp: cannot stat '/lib/engines': No such file or directory
May 24 21:29:51 laurel named[3235]: starting BIND 9.10.4-P5  -t /var/lib/named -u named
May 24 21:29:51 laurel named[3235]: running on Linux armv6l 4.3.3-6-raspberrypi #1 Wed Dec 16 08:03:35 UTC 2015 (db72752)
May 24 21:29:51 laurel named[3235]: built with '--prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--localstatedir=/var' '--libdir=/usr/lib' '--enable-exportlib' '--with-export-libdir=/usr/lib' '--with-export-includedir=/usr/i
May 24 21:29:51 laurel named[3235]: ----------------------------------------------------
May 24 21:29:51 laurel named[3235]: BIND 9 is maintained by Internet Systems Consortium,
May 24 21:29:51 laurel named[3235]: Inc. (ISC), a non-profit 501(c)(3) public-benefit
May 24 21:29:51 laurel named[3235]: corporation.  Support and training for BIND 9 are
May 24 21:29:51 laurel named[3235]: available at https://www.isc.org/support
May 24 21:29:51 laurel named[3235]: ----------------------------------------------------
May 24 21:29:51 laurel named[3235]: adjusted limit on open files from 4096 to 1048576
May 24 21:29:51 laurel named[3235]: found 1 CPU, using 1 worker thread
May 24 21:29:51 laurel named[3235]: using 1 UDP listener per interface
May 24 21:29:51 laurel named[3235]: using up to 4096 sockets
May 24 21:29:51 laurel named[3235]: ENGINE_by_id failed (crypto failure)
May 24 21:29:51 laurel named[3235]: error:25070067:DSO support routines:DSO_load:could not load the shared library:dso_lib.c:233:
May 24 21:29:51 laurel named[3235]: error:260B6084:engine routines:DYNAMIC_LOAD:dso not found:eng_dyn.c:467:
May 24 21:29:51 laurel named[3235]: error:2606A074:engine routines:ENGINE_by_id:no such engine:eng_list.c:390:id=gost
May 24 21:29:51 laurel named[3235]: initializing DST: crypto failure
May 24 21:29:51 laurel named[3235]: exiting (due to fatal error)
May 24 21:29:51 laurel named[3173]: ..failed
May 24 21:29:51 laurel systemd[1]: named.service: Control process exited, code=exited status=1
May 24 21:29:51 laurel systemd[1]: Failed to start LSB: Domain Name System (DNS) server, named.
-- Subject: Unit named.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit named.service has failed.
-- 
-- The result is failed.
May 24 21:29:51 laurel systemd[1]: named.service: Unit entered failed state.
May 24 21:29:51 laurel systemd[1]: named.service: Failed with result 'exit-code'.
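The decisive line in that output is the OpenSSL error ending in `id=gost`. As a rough sketch for picking such lines out of a longer journal dump, the snippet below scans log lines for the "no such engine" error and collects the engine ids. The function name `missing_engines` and the regular expression are illustrative assumptions based only on the line format shown above, not part of BIND or OpenSSL.

```python
import re

# Matches OpenSSL error-queue lines such as
# "error:2606A074:engine routines:ENGINE_by_id:no such engine:eng_list.c:390:id=gost"
# (pattern derived from the log above; other OpenSSL versions may format this differently).
NO_SUCH_ENGINE = re.compile(r"ENGINE_by_id:no such engine:[^:]*:\d+:id=(\w+)")


def missing_engines(journal_lines):
    """Return the ids of engines that failed to load, in order of first appearance."""
    seen = []
    for line in journal_lines:
        match = NO_SUCH_ENGINE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = [
    "May 24 21:29:51 laurel named[3235]: ENGINE_by_id failed (crypto failure)",
    "May 24 21:29:51 laurel named[3235]: error:2606A074:engine routines:"
    "ENGINE_by_id:no such engine:eng_list.c:390:id=gost",
    "May 24 21:29:51 laurel named[3235]: initializing DST: crypto failure",
]
print(missing_engines(sample))  # ['gost']
```

Feeding it the lines of a saved `journalctl` dump should work the same way as with the three sample lines.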

This failure is in fact a manifestation of Bug 1040027 – bind (named): fails to start since the introduction of namespaced openSSL packages.

A fix is in the pipeline at Request 496968 – openSUSE Build Service.

However, that fix never made it to the Raspberry Pi B (the original Raspberry Pi 1B), because that is armv6l and the bind build for that architecture has been failing since early April 2017.

That build failure is now tracked in Bug 1040697 – bind fails building for armv6l since 20170401 causing bugfixes not to make it to the wild.
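Since named runs chrooted (`-t /var/lib/named` in the log), the OpenSSL engines have to exist inside that chroot. The sketch below is one hedged way to check which engine directories under a chroot actually contain shared libraries. The candidate paths are taken from the log and the chat in this post; `lib/engines` inside the chroot is an assumption derived from the failing `cp` of `/lib/engines`, and the function name is hypothetical.

```python
import os
import tempfile

# Engine directories relative to the chroot, as mentioned in this post.
# "lib/engines" is an assumption based on the failing `cp` in the log.
CANDIDATE_DIRS = [
    "lib/engines",
    "lib64/engines",
    "usr/lib64/openssl-1_0_0/engines",
]


def engine_dirs_with_libraries(chroot, candidates=CANDIDATE_DIRS):
    """Return the candidate directories under `chroot` holding at least one .so file."""
    found = []
    for relative in candidates:
        path = os.path.join(chroot, relative)
        if os.path.isdir(path) and any(
            name.endswith(".so") for name in os.listdir(path)
        ):
            found.append(relative)
    return found


# Demonstration against a throwaway directory instead of a real chroot.
with tempfile.TemporaryDirectory() as fake_chroot:
    engines = os.path.join(fake_chroot, "usr/lib64/openssl-1_0_0/engines")
    os.makedirs(engines)
    # Placeholder file standing in for an engine library.
    open(os.path.join(engines, "libgost.so"), "w").close()
    print(engine_dirs_with_libraries(fake_chroot))
    # ['usr/lib64/openssl-1_0_0/engines']
```

On an affected system, calling `engine_dirs_with_libraries("/var/lib/named")` and getting an empty list would be consistent with the `no such engine` error above.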

–jeroen

9:48pm wiert: Hello all.
9:50pm wiert: I just updated one of my Raspberry Pi based secondary DNS servers with zypper dup, causing named to fail when starting with an issue that is very similar to Debian issues in 2012 and 2016: gost not loading in a chrooted environment.
9:50pm wiert: error:2606A074:engine routines:ENGINE_by_id:no such engine:eng_list.c:390:id=gost
9:50pm wiert: journalctl dump and references to the issues are at https://gist.github.com/jpluimers/2e35edc7b3b1cd0a7edca42916518ab5
9:54pm DimStar: wiert: this sounds a lot like bug 1040027
9:54pm |Anna|: bug 1040027 in openSUSE Tumbleweed (Other) “bind: fails to start since the introduction of namespaced openSSL packages” [Major,New] https://bugzilla.opensuse.org/show_bug.cgi?id=1040027
9:55pm DimStar: can you apply the diff from sr 496968 on your local system? That should get it going
9:55pm jordia65 left the chat room. (Ping timeout: 240 seconds)
9:55pm |Anna|: sr#496968 [submit network/bind -> openSUSE:Factory/bind] a- Fix named init script to dynamically find the location of the openss… (State: accepted) – https://build.opensuse.org/request/show/496968
9:55pm DimStar: (especially the part from /etc/init.d/named)
9:56pm wiert: It is indeed what I was searching for but couldn’t find.
9:56pm wiert: searching for named will get you here with only one entry: https://bugzilla.suse.com/show_bug.cgi?id=named
9:57pm wiert: searching for the error lines didn’t reveal any results. Is it OK if I add my journalctl output in the bug?
9:57pm wiert: I’ll probably need some help applying the diff. Not yet proficient enough with diff on the commandline.
9:58pm DimStar: wiert: sure – if it helps others to discover it better
9:58pm DimStar: wiert: yeah – I just see that the diff is a bit nasty – as it’s pre-processed during build.. so the easiest will be to just manually edit /etc/init.d/named
9:58pm wiert: OK. I will try editing /etc/init.d/named; as I’ve etckeeper running I can always revert when I screw up.
9:59pm DimStar: then around line 185 you have a mkdir /usr/lib64 call.. and below a cp command…
10:00pm }-Tux-{ joined the chat room.
10:00pm }-Tux-{ left the chat room. (Changing host)
10:00pm }-Tux-{ joined the chat room.
10:00pm DimStar: erm.. it’s actually /lib64 in the original file..
10:00pm wiert: I think lines 21-25 of my gist will suffice. Is that OK for you?
10:01pm DimStar: yes, that should be ok for the bug
10:02pm robby_ joined the chat room.
10:09pm wiert: I’ve added those and now looking at the diff for vendor-files.tar.bz2/init/named
10:10pm DimStar: wiert: ok.. let me paste you how it should be correctly (at around line 180)
10:10pm wiert: I think I can do this from my joe text editor.
10:10pm * wiert is an old fart used to WordStar keybindings
10:11pm DimStar: currently you only have /lib64 stuff there – and that should be /usr/lib64/openssl-1_0_0/engines
10:12pm * DimStar remembers that the x86_64 snapshot with this issue was actually blocked from being released
10:13pm mkubecek left the chat room.
10:19pm wiert: actually, none of my Raspberry Pi systems (including the ones not yet updated) have /var/lib/named/usr at all.
10:20pm wiert: an x64 version has /var/lib/named/usr/lib64/openssl-1_0_0/engines with CPE_NAME="cpe:/o:opensuse:tumbleweed:20170522"
10:20pm DimStar: that’s right – this is ‘new’ since openSSL moved from /lib to /usr/lib – so up to now the engines were in /var/lib/named/lib64/engines – now they need to be copied to /var/lib/named/usr/lib64/openssl-1_0_0/engines
10:21pm wiert: the x64 system also has /var/lib/named/lib64/engines/
10:21pm wiert: so there the .so files are present in two directories
10:21pm DimStar: I’m not sure why the bind package on ARM would not have followed that.. what CPE_NAME do you have there?
10:21pm wiert: ARM has CPE_NAME="cpe:/o:opensuse:tumbleweed:20170521"
10:21pm DimStar: yeah – the /var/lib/named/lib64/engines are no longer in use – but there is no cleanup code there
10:22pm DimStar: grmbl – the checkin of bind was on the 20th..
10:23pm wiert: I assume that the fix is in the x64, but not the ARM version.
10:23pm DimStar: ARM inherits all packages from openSUSE:Factory – it’s ‘just’ a bit slower with building/testing which is why there are less frequent snapshots
10:24pm wiert: Indeed: openSUSE_Tumbleweed -> x86_64 succeeded
10:24pm DimStar: but if 0521 went out, that means it must have completed the build – which would imply the updated bind too
10:24pm wiert: and openSUSE_Factory_ARM -> aarch64 blocked; armv7l -> blocked
10:25pm DimStar: that’s the current situation – not when 0521 was released :)
10:25pm DimStar: the Pi is aarch64, right?
10:25pm DimStar: can you check rpm -q --changelog bind | head ?
10:27pm wiert: x64: * Sat May 20 2017 dimstar@opensuse.org
10:27pm wiert: ARM: * Sat Feb 18 2017 kukuk@suse.com
10:27pm DimStar: WTF… osc rbl openSUSE:Factory:ARM bind standard aarch64 => this build was complete on Sat May 20 16:30:13 UTC 2017 – way before 0521 was even started
10:29pm DimStar: and then please: rpm -q --qf "%{disturl}\n" bind
10:31pm DimStar: then I really don’t get how this snapshot got published – this makes no sense at all…
10:31pm DimStar: let’s look a bit more: rpm -q bind => on aarch64 I see 9.10.4P5-26.1
10:32pm wiert: bind-9.10.4P5-24.1.armv6hl
10:32pm wiert: ah, that shows the architecture.
10:32pm DimStar: ok.. let me look into armv6hl then :)
10:32pm wiert: I think I forgot to tell it’s a Raspberry Pi B plus.
10:33pm DimStar: osc rbl openSUSE:Factory:ARM bind standard armv6l => that makes more sense: build failed on May 20th
10:34pm DimStar: wiert: wouldn’t have helped me at all – I have no idea about the Pi’s and what arch they run :)
10:34pm DimStar: the sad news: osc jobhist openSUSE:Factory:ARM bind standard armv6l
10:34pm DimStar: it implies it was not my change breaking ‘bind’ from building – this already failed on April 1st (it wasn’t me – I was on vacation then :P )
10:35pm wiert: I ran `arch` which shows armv6l
10:35pm wiert: what’s the difference between armv6l and armv6hl (the suffix in bind-9.10.4P5-24.1.armv6hl)
10:35pm fvogt: hardfloat
10:36pm wiert: and in more layman’s terms?
10:36pm fvogt: Supports the FPU, which means it’s a different ABI
10:37pm fvogt: You can’t use it on a armv6l system
10:37pm robby_ left the chat room. (Quit: Konversation terminated!)
10:37pm wiert: so why does `arch` show `armv6l` and not `armv6hl` ?
10:38pm wiert: A Raspberry Pi 2B in my collection shows armv7l
10:38pm fvogt: The kernel doesn’t care, it’s the userspace
10:38pm wiert: but RPM shows bind-9.10.4P5-25.1.armv7hl
10:39pm wiert: I’m confused as on x64 systems both are x86_64
10:39pm wiert: so I was kind of expecting ARM systems to both be the same as well
10:40pm wiert: Am I right that the conclusion so far is this:
10:41pm wiert: – Raspberry Pi 1 has broken bind build, so the patch doesn’t work
10:41pm wiert: – Raspberry Pi 2 and x64 are OK
10:42pm DimStar: almost – the patch ‘would’ work, but as bind fails for completely different reasons to build, no binary is produced
10:42pm DimStar: fvogt: does that ring a bell to you: 147s] contrib/dlz/config.dlz.in:31: warning: underquoted definition of DLZ_ADD_DRIVER
10:43pm wiert: I think that’s what I meant: the bind build was broken earlier, so the patch that got submitted later never got integrated into the distribution
10:43pm DimStar: but we have 2017 :P
10:44pm DimStar: wiert: yes, in this case that statement would be true – somebody has to fix build of bind of armv6l
10:44pm wiert: OK. Is that in the pipeline somewhere or should I submit a bug because of this?
10:45pm DimStar: wiert: I’d submit an extra bug for bind/armv6l – the bug to address the issue which you actually SEE is the one I linked / for which the patch is there; the fact that bind fails to build is a deeper rooted one
10:45pm DimStar: and in the view I have, there is no incoming fix for bind at the moment
10:46pm wiert: where is the URL I can see that bind fails to build for armv6l ?
10:46pm wiert: (in which I can see)
10:47pm lurchi__ is now known as lurchi_.
10:48pm wiert: I’ll submit a new bug to solve that build output and refer to the existing bug indicating that because the build fails, the existing bug isn’t in the current CPE_NAME
10:51pm mjwolf left the chat room. (Quit: Leaving)
10:52pm wiert: Can I keep the same priority/severity as 1040027 ?
10:52pm DimStar: severity yes – prio you should leave unchanged at P5 initially
10:52pm lurchi_ is now known as lurchi__.
10:54pm Vojtaeus joined the chat room.
10:54pm wiert: OK.
10:54pm wiert: https://build.opensuse.org/package/live_build_log/openSUSE:Factory:ARM/bind/standard/armv6l only shows it fails, but not since when. Where can I find the history?
10:58pm robby_ joined the chat room.
11:02pm robby_ left the chat room. (Ping timeout: 255 seconds)
11:06pm srinidhi left the chat room. (Ping timeout: 255 seconds)
11:06pm wiert: @DimStar: can you check the text of the first file in https://gist.github.com/jpluimers/2e35edc7b3b1cd0a7edca42916518ab5 as that’s what I want to put in the report.
11:06pm |Anna|: Error: “DimStar:” is not a valid command.
11:07pm DimStar: wiert: sounds reasonable
11:08pm DimStar: wolfiR: cross your fingers: openQA just started the next test with the new disk images
11:15pm |Anna|: bugzilla.suse.com bug 1040697 in openSUSE Tumbleweed (Other) “bind fails building for armv6l since 20170401 causing bugfixes not to make it to the wild” [Major,New]
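The armv6l versus armv6hl confusion in the chat comes down to the kernel reporting one name (`arch`) while RPM packages carry another (the suffix shown by `rpm -q bind`). The sketch below records only the pairs observed in this chat and checks a package name against them. The table and both function names are illustrative assumptions, not an official openSUSE mapping.

```python
# Kernel machine name (as printed by `arch`) -> RPM package architecture.
# Limited to the pairs observed in the chat above; not an exhaustive mapping.
KERNEL_TO_RPM_ARCH = {
    "armv6l": "armv6hl",
    "armv7l": "armv7hl",
    "x86_64": "x86_64",
}


def rpm_arch_of(package_name):
    """Return the architecture suffix of a name like 'bind-9.10.4P5-24.1.armv6hl'."""
    return package_name.rpartition(".")[2]


def matches_kernel(package_name, kernel_arch):
    """True when the package architecture is the one observed for this kernel."""
    return KERNEL_TO_RPM_ARCH.get(kernel_arch) == rpm_arch_of(package_name)


print(rpm_arch_of("bind-9.10.4P5-24.1.armv6hl"))               # armv6hl
print(matches_kernel("bind-9.10.4P5-24.1.armv6hl", "armv6l"))  # True
print(matches_kernel("bind-9.10.4P5-25.1.armv7hl", "armv6l"))  # False
```

Knowing the package architecture up front would have pointed straight at the armv6l build results instead of the aarch64 ones.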
