Tuesday 30 May 2017

Using kdump to analyze / capture kernel crashes with CentOS / Fedora / RHEL 7

kdump is a utility to help you capture a memory dump of the system when the kernel crashes.

kdump reserves a small portion of the memory for a 'crash kernel' that is invoked (as the name implies) when a system crash occurs - its sole purpose is to get a memory dump of the system at that point in time.

kdump comes as part of the kexec-tools package - so in order to install we will issue:

sudo dnf install --enablerepo=fedora-debuginfo --enablerepo=updates-debuginfo kexec-tools crash kernel-debuginfo

If you prefer to do the analysis on another server, the machine being monitored only needs kexec-tools, so we can simply issue:

sudo dnf install kexec-tools

Note: The kernel-debuginfo package provides the debug symbols (the vmlinux file) that the crash utility needs to analyse the kernel vmcore.

We will need to enable the crash kernel in our grub config - as a one-off we can find the linux line on the relevant boot entry in /boot/grub2/grub.cfg and add:

crashkernel=128M

or we can make it persistent and add it to:

/etc/default/grub

and prepend it to the 'GRUB_CMDLINE_LINUX' variable e.g.:

GRUB_CMDLINE_LINUX="crashkernel=128M rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet"

Warning: 'crashkernel=auto' - the auto value does not work with Fedora!

and then regenerate the grub configuration with:

sudo grub2-mkconfig -o /boot/grub2/grub.cfg

We can also define where the dumps are stored - by default this is on the local filesystem under /var/crash - however we can (and should ideally) house this on external sources like a file share - to do this we can edit the kdump config file:

sudo vi /etc/kdump.conf

and uncomment the relevant options.

We can also enable the 'core_collector' feature that will compress our dumps for us - however on Fedora 25 this is already enabled for us - if not uncomment the following line:

core_collector makedumpfile -l --message-level 1 -d 31

Now restart your system and then enable / start the kdump service with:

sudo systemctl enable kdump && sudo systemctl start kdump 
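
After the reboot it can be worth confirming that the crashkernel parameter actually made it onto the kernel command line (the running command line is exposed in /proc/cmdline). Below is a minimal Python sketch of that check - the function name and the sample command line are my own illustration, not part of kdump:

```python
def crashkernel_setting(cmdline):
    """Return the value of the crashkernel= parameter, or None if absent."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


# Illustrative command line; on a real system read /proc/cmdline instead.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/mapper/fedora-root ro crashkernel=128M rhgb quiet"
print(crashkernel_setting(sample))
```

If this returns None on the real /proc/cmdline then the grub change has not taken effect and kdump will have no memory reserved for the crash kernel.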

We can then trigger a kernel panic using SysRq - using the proc filesystem:

echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger

We can then inspect the dump with:

crash /var/crash/127.0.0.1-2017-05-30-14\:02\:00/vmcore /usr/lib/debug/lib/modules/`uname -r`/vmlinux

Determine when a server was last rebooted / shutdown

Using the utility 'last' we can quickly identify when a server was rebooted or shutdown and by whom - of course you can do this by digging through syslog - however this provides a quick alternative.

We should issue:

sudo last -wxF

Example output:

user2   pts/0        5.6.7.8     Tue May 30 09:28:11 2017   still logged in
runlevel (to lvl 2)   3.XX.X-XX-generic Mon May 29 22:49:26 2017 - Tue May 30 09:44:59 2017  (10:55)
reboot   system boot  3.XX.X-XX-generic Mon May 29 22:49:26 2017 - Tue May 30 09:44:59 2017  (10:55)
user   pts/2        1.2.3.4     Mon May 29 08:37:32 2017 - Mon May 29 10:48:27 2017  (02:10)
user   pts/2        1.2.3.4     Thu May 25 12:37:50 2017 - Thu May 25 17:42:15 2017  (05:04)
user   pts/2        1.2.3.4     Mon May 15 08:38:54 2017 - Mon May 15 10:44:46 2017  (02:05)
user   pts/2        1.2.3.4     Tue May  9 08:32:44 2017 - Tue May  9 10:34:29 2017  (02:01)
user   pts/3        1.2.3.4     Tue May  9 08:28:45 2017 - Tue May  9 10:31:14 2017  (02:02)
user   pts/3        1.2.3.4     Tue May  9 07:51:54 2017 - Tue May  9 07:51:56 2017  (00:00)
user   pts/3        1.2.3.4     Tue May  9 07:49:07 2017 - Tue May  9 07:50:53 2017  (00:01)
user   pts/2        1.2.3.4     Tue May  9 07:22:15 2017 - Tue May  9 08:32:39 2017  (01:10)
user   pts/2        1.2.3.4     Mon May  8 05:50:01 2017 - Mon May  8 07:59:39 2017  (02:09)
user   pts/2        1.2.3.4     Fri May  5 10:57:47 2017 - Fri May  5 13:06:11 2017  (02:08)
user   pts/2        1.2.3.4     Thu May  4 11:33:05 2017 - Thu May  4 13:36:59 2017  (02:03)
user   pts/2        1.2.3.4     Thu May  4 09:10:16 2017 - Thu May  4 11:19:01 2017  (02:08)
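
If you need to pull the boot times out of this output in a script, something along the lines of the Python sketch below should do it. It assumes the full timestamp format produced by the -F flag and English day / month names; the function name is hypothetical and the sample is taken from the output above:

```python
import re
from datetime import datetime

# Matches the full timestamps printed by 'last -F', e.g. "Mon May 29 22:49:26 2017".
STAMP = re.compile(r"[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}")


def boot_times(last_output):
    """Return the boot time of every 'reboot' record in 'last -F' output."""
    times = []
    for line in last_output.splitlines():
        if not line.startswith("reboot"):
            continue
        match = STAMP.search(line)  # the first timestamp on the line is the boot time
        if match:
            times.append(datetime.strptime(match.group(0), "%a %b %d %H:%M:%S %Y"))
    return times


SAMPLE = """\
user2   pts/0        5.6.7.8     Tue May 30 09:28:11 2017   still logged in
reboot   system boot  3.XX.X-XX-generic Mon May 29 22:49:26 2017 - Tue May 30 09:44:59 2017  (10:55)
user   pts/2        1.2.3.4     Tue May  9 08:32:44 2017 - Tue May  9 10:34:29 2017  (02:01)
"""
print(boot_times(SAMPLE))
```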

Factory reset / wipe a 2960/X switch

The easiest way to wipe a 2960/X switch is to boot the switch into recovery mode by powering off the switch, then holding down the mode button while you power the switch back on.

The boot sequence should be considerably quicker than usual and you should be presented with a 'switch:' prompt.

In order to access the flash memory we will need to firstly instruct the switch to initialise the filesystem with:

flash_init

and then delete the following:

delete flash:/config.text

and

delete flash:/vlan.dat

And anything else like config backups etc.

Sources
 http://www.cisco.com/c/en/us/support/docs/switches/catalyst-2950-series-switches/41845-192.html

Friday 26 May 2017

Fixed: %SW_DAI-4-PACKET_RATE_EXCEEDED: XX Packets received in XXX milliseconds on Gi0/X

%SW_DAI-4-PACKET_RATE_EXCEEDED: XX Packets received in XXX milliseconds on Gi0/X
%PM-4-ERR_DISABLE: arp-inspection error detected on Gi0/X, putting Gi0/X into err-disable state.

The above problem arises when an interface configured with DAI (Dynamic ARP Inspection) receives more ARP packets per second than its rate limit allows (by default 15 packets per second on untrusted interfaces.)

On busy networks this will often need to be increased to 100 or even more. Specific applications such as the Bonjour-Service (which iTunes utilises) will also greatly increase the number of arp requests.

When the switch detects the interface has gone over the quota it will put the interface into an err-disable state.

To bring the interface back up we can clear the error or simply shut the interface down and bring it up again:

int gi0/10
shutdown
no shutdown

In order to increase the limit we should issue the following (per interface):

int range gi0/1-46
ip arp inspection limit rate 100
do wri mem

Thursday 25 May 2017

Setting up QoS on the Cisco 2960X / 3650-X

Cisco IOS provides QoS on both layer 2 (handled by CoS) and layer 3 (ToS / DSCP).

CoS (Class of Service) is a 3 bit field that is present in an Ethernet frame header when 802.1q trunking is in place.

A priority value from 0 to 7 is set in the field - the higher the value, the more urgently the switch will try to deliver the frame with low latency (providing the switch is set up with QoS enabled!)

VoIP phones are often considered to have a priority of 5 - however this is not set in stone. Some common mappings can be found below:


ToS (Type of Service) is an 8 bit field that is part of an IP packet. There are two common QoS methods - one being 'IP Precedence' (the older method) that uses 3 bits of the field and DSCP (the newer / preferred method) that uses 6 bits in the field - as is illustrated below:



It is also worth bearing in mind that DSCP is backward compatible with IP Precedence.

The same priority mapping can be applied to ToS as above.

Trust Boundary 

In QoS a trust boundary is simply the point in your network at which you trust the CoS or DSCP priorities of inbound packets / frames. For example a user might purposely mark the DSCP priority of IP packets from their computer - however the switch (by default) will ignore the marking and strip it off the packet. In order for the computer to be trusted you would need to issue something like:

mls qos # turn on qos
int fa0/1
mls qos trust cos # trust the interface

or

mls qos trust cos pass-through # this will also ensure any existing DSCP values are not overwritten by the switch's CoS to DSCP map!

It is also important to note that this also applies to upstream switches - for example if I wanted to ensure priority delivery of the frame when it hits the core switch I would need to ensure that the same configuration is applied on the uplink port on the core switch.

CoS -> DSCP and DSCP -> CoS Mappings

These mappings provide a way to ensure that (for example) a frame that is marked with a CoS priority value of 5 will also have an equivalent DSCP value when it hits a layer 3 device / router.

Although these mappings can be customised - the default mappings can be found below:


As an example:

mls qos map cos-dscp 0 8 16 24 32 40 48 56

or the other way around (dscp to cos):

mls qos map dscp-cos 16 18 24 26 to 1 
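
To see where the numbers in the cos-dscp map come from: each of the eight values is simply the 3 bit CoS value shifted into the top three bits of the 6 bit DSCP field - which is also why DSCP stays backward compatible with IP Precedence. A small Python sketch of the arithmetic (the function names are my own, purely for illustration):

```python
def cos_to_dscp(cos):
    """Map a 3 bit CoS value onto the DSCP value with the same top three bits."""
    if not 0 <= cos <= 7:
        raise ValueError("CoS must be between 0 and 7")
    return cos << 3


def dscp_to_precedence(dscp):
    """Return the IP Precedence an older device would read from a DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp >> 3


# Reproduces the eight values used in 'mls qos map cos-dscp' above.
print([cos_to_dscp(cos) for cos in range(8)])
```

So a frame marked CoS 5 maps to DSCP 40, and a device that only understands IP Precedence will still read that packet as precedence 5.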


QoS Example: Prioritising SSH traffic on a Cisco 2960-X switch

Unfortunately on the 2960X you are unable to classify specific TCP / UDP protocols - so instead we have to define address ranges / ports in order to categorise traffic we wish to apply QoS to.

Let's firstly turn on QoS with:

mls qos

and define our CoS to DSCP map:

mls qos map cos-dscp 0 8 16 24 32 40 48 56

Now we will set the CoS priority on our traffic - in some cases end user devices such as VoIP phones can do this for us - however for the purposes of completeness I will perform this on the switch instead.

In this example I want to ensure that all traffic from any source to a particular IP (1.2.3.4) is marked with a CoS priority of 5:

access-list 123 permit ip any 1.2.3.4 0.0.0.0

class-map match-all CM-CALLSERVER
match access-group 123

policy-map PM-CALLSERVER
class CM-CALLSERVER
set ip dscp 40

and apply the policy map to the interface connected to the VoIP phone:

int fa0/10
service-policy input PM-CALLSERVER

We can verify that traffic is being matched with:

show policy-map int fa0/10

Let's now instruct the switch what to do with traffic that we have tagged:

Note: On the 2960-X there are a total of two input queues and four output queues - for more information about them please refer to this article. By default queue 2 is the priority queue on the 2960-X

mls qos srr-queue input dscp-map queue 1 threshold 3 0 8 16 32

mls qos srr-queue input dscp-map queue 2 threshold 3 40 48 56

mls qos srr-queue input cos-map queue 1 threshold 3 0 1 2 3 4

mls qos srr-queue input cos-map queue 2 threshold 3 5 6 7

mls qos srr-queue output cos-map queue 4 threshold 3 0 1

mls qos srr-queue output cos-map queue 3 threshold 3 2 3

mls qos srr-queue output cos-map queue 2 threshold 3 4

mls qos srr-queue output cos-map queue 1 threshold 3 5 6 7

mls qos srr-queue output dscp-map queue 4 threshold 3 0 8

mls qos srr-queue output dscp-map queue 3 threshold 3 16 24

mls qos srr-queue output dscp-map queue 2 threshold 3 32

mls qos srr-queue output dscp-map queue 1 threshold 3 40 48 56

Note: By default when QoS is enabled - ingress traffic will be marked with CoS 0 / DSCP 0 (or the traffic is not marked in the first place) - unless you 'trust' the interface e.g.:

int fa0/10
mls qos trust cos
mls qos trust dscp
spanning-tree portfast
switchport mode access
switchport access vlan 123


Wednesday 17 May 2017

Cisco Switch Template v1.0: Ports, Security / Hardening, Features

Global Configuration

Spanning Tree

Configure the root bridges for your VLANs:
conf t
spanning-tree vlan 123 root primary

For access switches we can utilize uplink fast:
spanning-tree uplinkfast


Port Configuration

PC Client Access Port:

Features: Sticky Ports, DAI, IP Source Guard, Storm Control

int gix/y/z
ip addr 1.2.3.4 255.255.255.0
switchport mode access 
switchport access vlan 123
switchport port-security
switchport port-security mac-address sticky
spanning-tree portfast

spanning-tree bpduguard enable
spanning-tree bpdufilter enable
OR
spanning-tree guard root

switchport nonegotiate
no cdp enable
storm-control broadcast level 10.00
storm-control action trap || shutdown

Trunk Port:

int gix/y/z
switchport mode trunk
switchport trunk encapsulation dot1q
switchport trunk native vlan 1000
switchport trunk allowed vlan 10,11,12,13
switchport nonegotiate
storm-control broadcast level 10.00
storm-control action trap OR shutdown
no cdp enable

with aggregation:

channel-protocol lacp
channel-group 10 mode active
port-channel load-balance dst-ip | src-ip etc.


and optional - 

spanning-tree portfast trunk // in circumstances where link aggregation is in place without LACP or PAgP.
spanning-tree guard root // where necessary to prevent another switch taking root
spanning-tree bpduguard enable // to prevent rogue switches from joining your network

Quality of Service

Limiting switch port ingress and egress traffic:

mls qos

ip access-list extended ACL_ALLTRAFFIC
permit ip any any

class-map match-all CLASS_ALLTRAFFIC
  match access-group name ACL_ALLTRAFFIC

policy-map POLICY_ALLTRAFFIC
  class CLASS_ALLTRAFFIC
    police 1250000 12500000 exceed-action drop

interface GigabitEthernet0/2
service-policy input POLICY_ALLTRAFFIC
srr-queue bandwidth limit 90

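* The policy map 'POLICY_ALLTRAFFIC' is intended to allow a normal ingress operational speed of 10 Mbps and a burst rate of 100 Mbps. Note that 'police' takes the rate in bits per second and the burst size in bytes, so check these values against the ranges your platform accepts. *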

* The srr-queue bandwidth limit value is the percentage of the port speed that egress traffic is limited to - for example if you have an interface speed of 1 Gigabit and you limit it to 90, the end device gets at most 90% of the available bandwidth - in this case roughly 900 Mbit. To cap it at around 100 Mbit you would use a value of 10. *

Setting up CoS / DSCP:

http://blog.manton.im/2017/05/setting-up-qos-on-cisco-2960x-3650-x.html


Services Configuration

SSH / AAA:

conf t
username test privilege 15 secret $tr0ngPa$$w0rd!
aaa new-model
aaa authentication login default local
line console 0
login authentication default
line vty 0
login authentication default
ip domain-name yourdomain.local
crypto key generate rsa modulus 2048
ip ssh version 2
line vty 0
transport input ssh
ip access-list standard mgmt-ssh
10 permit <management-subnet> <management-wildcardmask>
20 deny any log
line vty 0
access-class mgmt-ssh in

VTP:

vtp domain mydomain.internal
vtp version 3
vtp mode transparent // to reset revision number
vtp mode server OR client

vtp password xyz

SNMP v3:

ip access-list standard mgmt-snmp
1 permit 10.0.0.0 0.0.0.255
10 deny any log
snmp-server group snmp v3 auth access mgmt-snmp
snmp-server user snmp snmp v3 auth md5 <password>
snmp-server host 10.0.2.75 version 3 auth snmp
snmp-server enable traps snmp linkdown linkup coldstart warmstart

SNMP v2c:

ip access-list standard mgmt-snmp
1 permit 10.0.0.0 0.0.0.255
10 deny any log
snmp-server view SNMPView iso included
snmp-server community <community-name> view SNMPView RO mgmt-snmp
snmp-server host <remote-server> version 2c <community-name>
snmp-server enable traps snmp linkdown linkup coldstart warmstart

Remote Logging:

logging 1.2.3.4 // syslog server
logging buffered 64000 debug

NTP:

ntp server 81.168.77.149 prefer
ntp server 194.164.127.6
ntp server 194.164.127.4

RADIUS:

radius server <friendly-name>
address ipv4 <ip-address>
key <shared-secret>

aaa new-model
aaa authentication login default group radius local
aaa authorization exec default group radius local if-authenticated

aaa accounting system default start-stop group radius

Hardening:

no ip http server
no ip http secure-server
no ip domain-lookup
no service dhcp
no service pad
line vty 0
exec-timeout <minutes>
service tcp-keepalives-in
service tcp-keepalives-out

DHCP Snooping:
ip dhcp snooping
ip dhcp snooping vlan 100
# trust a server / port
int gi0/15
desc DHCP_Server
ip dhcp snooping trusted
no ip dhcp snooping information option # if using non-cisco DHCP server

Dynamic ARP Inspection
ip arp inspection vlan 100
show ip arp inspection vlan 100
# trust uplink interface
int g0/15
ip arp inspection trust
exit
ip arp inspection log-buffer entries 512
int range gi0/1-48
ip arp inspection limit rate 100

IP Source Guard
int gi0/4
ip verify source
# exclusions

ip source binding 1111.2222.3333 vlan 100 1.2.3.4 interface gi0/20

Tuesday 16 May 2017

Setup VTP (VLAN Trunking Protocol) on Cisco Devices

VTP (VLAN Trunking Protocol) is a way of distributing VLAN information across multiple switches in your network. Although VLANs are local to each switch, VTP gives you a quick and painless way of adding, removing and modifying VLANs across all of them.

VTP (of course) will only work on trunked ports - however by default all VLAN information

There are four modes in VTP version 3 (only the first three exist in VTP versions 1 and 2):

server: This is the authoritative node that decides which VLANs will be created, deleted etc.

client: This mode listens and relays VTP messages - however is unable to add / delete VLANs from the domain.

transparent: This mode ignores incoming VTP messages - however does pass them on to neighbours.

off: This mode (only available in version 3) completely ignores VTP messages.

On switch one (the vtp server) we will define our domain:

vtp domain mydomain.internal

and the VTP version - along with the mode:

vtp version 3
vtp mode server

We can also (optionally) set a password with:

vtp password xyz

To review our configuration we should run:

do show vtp status

Here we can also identify which VTP revision number we are on.

Now on the second (VTP client) switch - we'll sort out the domain and version again:

vtp domain mydomain.internal

Note: If you do not specify a VTP domain (null by default) and the switch receives a VTP message - it will automatically configure the switch with the message's VTP domain!

Important: Ensure that the VTP server (Switch 1) has all of the relevant VLANs that are already configured on Switch 2 - otherwise these will be lost and the links will go down when VTP is turned on!

Important: Before we go any further we need to ensure that Switch 2's VTP revision number is not higher than that of Switch 1 - otherwise this could be disastrous! This is because Switch 1 will think Switch 2 has a newer configuration and overwrite its own VLAN database (vlan.dat).

However this typically won't happen with new switches - but if it's already in use you should check the revision number with:

do show vtp status

and if it's higher (or the same as) Switch 1 we'll need to reset the revision number by putting the switch's VTP instance into transparent mode:

vtp mode transparent

and then into the desired mode:

vtp mode server

or

vtp mode client
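
The rule of thumb above (reset if the joining switch's revision is higher than or the same as the server's) can be written down as a tiny check - handy if you are auditing a batch of switches from collected 'show vtp status' output. A minimal Python sketch, where the function name and the revision numbers are made up for illustration:

```python
def must_reset_revision(joining_revision, server_revision):
    """True if the joining switch should be cycled through transparent mode first."""
    return joining_revision >= server_revision


# Hypothetical revision numbers read from 'show vtp status' on each switch.
print(must_reset_revision(joining_revision=12, server_revision=7))
print(must_reset_revision(joining_revision=0, server_revision=7))
```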

If all goes to plan you should typically not have any downtime on your trunks - however with anything like this I'd strongly recommend scheduling a maintenance period!

Tip: If you wish to disable VTP on an interface (this will prevent inbound VTP messages reaching the switch interface) you can issue:

int gix/y
no vtp

Or if you are connecting another switch and want to ensure that it does not join the VTP domain you can issue:

vtp mode off

Debugging Activesync connections to a specific mailbox

Firstly enable the debugging on the desired mailbox with:

Set-CASMailbox <mailbox-name> -ActiveSyncDebugLogging:$true

Wait / replicate the problem and then generate a report with:

Exchange 2007 - 2010

Get-ActiveSyncDeviceStatistics -Mailbox alias -GetMailboxLog:$true -NotificationEmailAddress <email-address>

Exchange 2013+

Get-MobileDeviceStatistics -Mailbox alias -GetMailboxLog:$true -NotificationEmailAddresses <email-address>

Finally disable debugging when you are done (as these logs can be very verbose and consume a fair bit of storage quite quickly!)

Set-CASMailbox <mailbox-name> -ActiveSyncDebugLogging:$false

When you receive the reports you should get something like:

Exchange ActiveSync Mailbox Logs - <MailboxName>


Date: 5/16/2017 11:30:09 AM

User name: <MailboxName>

Device type: iPhone
Device ID: 123456789ABCDEFGHIJKL

Device type: iPhone
Device ID: ABCDEFGHIJKL123456789

We can also quickly identify the user associated with the device with:

Get-ActiveSyncDevice | Where {$_.DeviceId -Match "123456789ABCDEFGHIJKL"} | Select UserDisplayName

Monday 15 May 2017

UDLD (Unidirectional Link Detection)

UDLD (Unidirectional Link Detection) is a layer 2 Cisco proprietary protocol that detects unidirectional links.

Copper / UTP (by design) rarely suffers from unidirectional links, so UDLD is rarely used on it - fiber on the other hand does: if the RX (or TX) fiber goes you are left with a unidirectional link.

UDLD sends heartbeats down the fiber in order to detect whether a unidirectional link is present.

UDLD can still come in useful with copper when you have a single / non-aggregated port (not using LACP etc.) and there is an intermediate device between itself and the endpoint - e.g. a media converter or WAN optimiser.

By default UDLD is not enabled - in order to turn it on we can issue:

int gi1/0/1
udld port

There is also an aggressive option - that forces UDLD to be enabled on the other end as well:

int gi1/0/1
udld port aggressive

Wednesday 10 May 2017

Configuring port aggregation and trunking with a Cisco 2960 and ESXI

There seems to be a fair bit of misinformation out there regarding ESXI's ability to work with link aggregation.

Firstly LACP support is only available with 5.1 and upwards and this must be done via a Distributed Switch (which means your licensing costs will likely skyrocket) or the Cisco Nexus 1000v virtual switch.

So in what I assume is the majority of cases (i.e. if you don't have Enterprise Plus licensing) you will have to use static link aggregation - i.e. one that doesn't use LACP or PAgP - this has several disadvantages though, some of the main ones being:

- No protection against misconfiguration e.g. switching loops.
- Failover - if a 'dumb' device is sitting in between e.g. a WAN optimizer or similar - and its uplink dies; a static configuration won't detect this and will keep sending traffic down the line!

In my case (unfortunately) I am forced to go with the static configuration - so we'll firstly configure the aggregate ports on our switch:

int range gi1/0/15, gi2/0/15
no shutdown
switchport mode trunk
switchport trunk encapsulation dot1q
switchport trunk native vlan 999
switchport trunk allowed vlan 2,3,4
switchport nonegotiate
spanning-tree portfast edge trunk
channel-group 15 mode on

And then on the vSphere GUI go to:

Configuration >> Networking >> Properties >>

Create a new vSwitch - ensuring that both adapters are added to it.

Proceed by clicking 'Edit' on the new Virtual Switch and click on the 'NIC Teaming' tab.

Set the 'Load Balancing' option to 'Route based on ip hash' and finally ensure that both NICs are under the 'Active Adapters' view.

Tuesday 9 May 2017

Removing old kernels from grub with CentOS 7 / RHEL

After a while you can easily accumulate a fair few kernels and this can become a pain when using grub.

In order to remove them from grub (and also from our system) we should firstly identify them with:

rpm -qa | grep '^kernel-[0-9]'

We should also ensure we are aware of which kernel we are using(!) with:

uname -r

And then remove the kernel with dnf / yum:

sudo yum remove kernel-4.8.6-300.fc25.x86_64
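
If you have a lot of kernels to clear out, the main thing to get right is never removing the one you are running. A rough Python sketch of that filter is below - it assumes the package names come from the rpm query above and the release string from 'uname -r'; the function name is hypothetical:

```python
def removable_kernels(installed_packages, running_release):
    """Return the installed kernel packages that are not the running kernel."""
    running_package = "kernel-" + running_release
    return [pkg for pkg in installed_packages if pkg != running_package]


# Illustrative values; substitute the real output of the two commands above.
installed = [
    "kernel-4.8.6-300.fc25.x86_64",
    "kernel-4.10.5-200.fc25.x86_64",
]
print(removable_kernels(installed, "4.10.5-200.fc25.x86_64"))
```

This only filters out the running kernel - it does not try to work out which of the remaining kernels is newest, so you may still want to keep one known-good fallback.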

If you are using a Debian based distribution you'd need to run 'update-grub2' here - however with RHEL based OS's the same function is effectively invoked as a script when you install / remove the kernel rpm.

To preview the config (before committing) - this prints the generated configuration to stdout without writing anything:

grub2-mkconfig

and then to commit:

grub2-mkconfig -o "$(readlink -f /etc/grub2.conf)"

* Note: The above is executed within a script when installing new kernels from rpm's *

For some reason I also had to manually delete them after (in addition to yum remove) to get grub generating the correct config:

rm /boot/initramfs-4.8.6-300.fc25.x86_64.img
rm /boot/vmlinuz-4.8.6-300.fc25.x86_64

and then ended up running the configuration script again:

grub2-mkconfig -o "$(readlink -f /etc/grub2.conf)"

Limiting the amount of installed kernels with yum

For future yum can actually handle the above for you by limiting the number of kernels that can be installed at one time - firstly install the yum utilities package with:

yum install yum-utils

and then remove all but the two newest kernels with:

package-cleanup --oldkernels --count=2

To make the limit permanent set 'installonly_limit=2' in /etc/yum.conf.

Shutting down / booting up an EMC Unity SAN

To save me (and hopefully others) digging through the several hundred page manual - the following procedures demonstrate how to safely bring a Dell / EMC Unity SAN up / down in a controlled fashion.

Shutting down process

Presuming you have access to the controller web interface - firstly go to 'Service' in the navigation bar under the 'System' node.

Note: You will need your 'service password' to perform this - this is specified during the initial setup wizard.

Click on the 'Service Tasks' tab and under the 'Storage System' view select 'Shut Down Storage System' and hit execute.

The shutdown process can vary greatly and take between (in my experience) 15 to 30 minutes to complete fully.

You can easily confirm when the storage processors have shut down by checking that the PSU lights are BOTH solid amber and green, the SP Status Fault LEDs are solid amber, the network configuration LEDs are on and all other LEDs are off.

Remove the power leads from the SP's and then any DAE's.

Booting up process

Firstly re-connect the power leads to the DAEs in ascending order - as follows:

DAE 0
DAE 1
DAE 2
and so on...

Confirm the blue LED on the front of the DAE(s) are solid.

Then re-connect the power cables for the SP's in the following order (important):

SP A
SP B

This process can take (in my experience) between 15 - 25 minutes.

Wednesday 3 May 2017

Creating / installing a simple Linux kernel module

This tutorial will explain how we can create a simple Linux kernel module (hello world style) and how it can be installed.

When building Linux kernel modules you require the kernel's header files. Header files in general declare the functions in the source - you do not require the actual implementation of the functions (as this would introduce a lot of unneeded code) - rather the function signature is provided - that is the return type and parameters of the function.

For example this might be:

int myFunction(char param1[], char param2[]);

Headers are typically used when compiling device drivers - while (obviously) you'd require the full sources if you were compiling a kernel.

Let's start by firstly installing the kernel headers for our current kernel:

sudo yum install kernel-devel kernel-headers

And then let's create a simple 'hello world' module (credit to tldp.org):

/*
 *  hello-1.c - The simplest kernel module.
 */
#include <linux/module.h> /* Needed by all modules */
#include <linux/kernel.h> /* Needed for KERN_INFO */

int init_module(void)
{
    printk(KERN_INFO "Hello world 1.\n");

    /*
     * A non 0 return means init_module failed; module can't be loaded.
     */
    return 0;
}

void cleanup_module(void)
{
    printk(KERN_INFO "Goodbye world 1.\n");
}

And then creating a makefile for it:

vi Makefile

and adding:

obj-m += hello-1.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

and then compiling with:

make all

You should see something like the following:

make -C /lib/modules/4.10.5-200.fc25.x86_64/build M=/home/limited/test modules
make[1]: Entering directory '/usr/src/kernels/4.10.5-200.fc25.x86_64'
  Building modules, stage 2.
  MODPOST 1 modules
make[1]: Leaving directory '/usr/src/kernels/4.10.5-200.fc25.x86_64'

Locate the .ko (Kernel Object) file - this is the module that we will load into the kernel - we can view information about it with:

modinfo hello-1.ko

This is useful for verifying that the module has been built for the correct kernel (by inspecting the 'vermagic' output.)
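
If you want to automate that vermagic check, the Python sketch below compares the kernel release in the modinfo output against the running kernel release (as reported by 'uname -r'). The function names and the sample modinfo text are illustrative assumptions on my part:

```python
def vermagic_release(modinfo_output):
    """Return the kernel release from the 'vermagic' line of modinfo output."""
    for line in modinfo_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "vermagic":
            fields = value.split()
            return fields[0] if fields else None
    return None


def built_for_running_kernel(modinfo_output, running_release):
    """True if the module was built against the given kernel release."""
    return vermagic_release(modinfo_output) == running_release


# Illustrative sample of what modinfo might print for the module.
SAMPLE_MODINFO = """\
filename:       /home/limited/test/hello-1.ko
vermagic:       4.10.5-200.fc25.x86_64 SMP mod_unload
"""
print(built_for_running_kernel(SAMPLE_MODINFO, "4.10.5-200.fc25.x86_64"))
```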

We can install the module with:

insmod /home/limited/test/hello-1.ko

* Warning * At the time of writing Fedora currently has a bug report open which I ran into while attempting to install the module - see: https://bugzilla.redhat.com/show_bug.cgi?id=1426741

[ 4192.599748] hello_1: loading out-of-tree module taints kernel.
[ 4192.599750] hello_1: module license 'unspecified' taints kernel.
[ 4192.599750] Disabling lock debugging due to kernel taint
[ 4192.599784] hello_1: module verification failed: signature and/or required key missing - tainting kernel
[ 4192.600029] Hello world 1.

You should see the hello world message - but you might also encounter a message complaining about 'module verification failed' - this comes from a kernel option that expects any additional modules to be signed. Depending on how it is configured, unsigned modules will either taint the kernel (as above) or fail to load.

Sources

The Linux Kernel Module Programming Guide: http://www.tldp.org/LDP/lkmpg/2.6/html/x121.html

Monday 1 May 2017

Building a minimal (64bit) Linux system with a vanilla kernel on the Raspberry Pi 3

One of the first things I wanted to do with the new Raspberry Pi (3) was create my own (simple) Linux distribution.

The Pi 3 is based on a BCM2837 SoC - which has a 64-bit ARMv8 CPU (as opposed to ARMv7 in the Pi 2) - however the upstream kernels provided by the Pi Foundation are unfortunately all 32-bit - so for this tutorial I will concentrate on providing a 64-bit kernel so we can fully utilise its power.

We will firstly need to understand how the boot process works on the Raspberry Pi - unlike a normal desktop computer, which uses a BIOS to initiate a bootloader such as Grub, the Raspberry Pi has closed source firmware in the SoC (System on a Chip).

This firmware is read-only / can't be modified in any way - this enables the second-stage bootloader to be read from a FAT32 formatted partition on the SD-Card.

The second-stage bootloader (bootcode.bin) is used to retrieve and program the GPU firmware (start.elf) from the SD-Card, as well as starting up the CPU. There is also an additional file called fixup.dat that configures the SDRAM between the GPU and CPU.

A kernel is then loaded - by default (on the Raspberry Pi 3) this is named either kernel7.img (32 bit) or kernel8.img (64 bit) and is a Linux kernel - however of course this doesn't necessarily have to be.

The three files above (bootcode.bin, fixup.dat and kernelX.img) are required as a minimum in order to get the Pi up and running.

For a more detailed overview of how the boot process works please see this article.

The Pi Foundation maintains its own kernel tree for the Pi - which as of right now is 4.9 - however the mainline kernel version also works pretty well too!

To start with let's firstly obtain the latest vanilla / mainline kernel - which at this moment is 4.11 (rc8) - we can download this from here:

https://cdn.kernel.org/pub/linux/kernel/v4.x/testing/linux-4.11-rc8.tar.xz

and then cross-compile it - I am going to be using Fedora for this - however a lot of people also do this on Debian / Ubuntu:

mkdir -p /home/limited/workspace
cd /home/limited/workspace
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/testing/linux-4.11-rc8.tar.xz
tar xvf linux-4.11-rc8.tar.xz
cd linux-4.11-rc8

now let's also ensure that we are going to have the relevant utilities to compile the kernel:

sudo yum groupinstall "Development Tools" "Development Libraries"
sudo yum install gcc-aarch64-linux-gnu

Ensure that the kernel .config file is clean / in its default state with:

make mrproper

The kernel config files (defconfig) are located within:

arch/arm64/configs

Within the Pi Foundation upstream kernel tree you can get hold of bcmrpi_defconfig - which as it stands seems to be the most stable configuration - however as I'm trying to make this as generic as possible I am going to use the default defconfig for ARM64.

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig

We should also backup the .config file we have generated so we don't lose it next time we cleanup the configuration:

cp .config backup-conf.txt

and finally compile the kernel:

make -j2 ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-

(where '-j' defines how many cores you wish to utilise during the compilation.)

We should then find the kernel in arch/arm64/boot/Image.gz

Building the root filesystem

For the rootfs I will be using busybox (so I don't over complicate things) - the latest version is currently 1.26.2:

cd /home/limited/workspace
wget https://www.busybox.net/downloads/busybox-1.26.2.tar.bz2

and again we will cross-compile busybox:

cd busybox*
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig

or for the GUI config:

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- menuconfig

and then compile it with:

mkdir /home/limited/workspace/rootfs
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- install CONFIG_PREFIX=/home/limited/workspace/rootfs

* Specifying the 'CONFIG_PREFIX' allows us to specify where the structure / root for the compiled files will end up. *

This is where everything failed for me - the compiler started complaining about missing glibc headers - it turns out that Fedora does not provide these, as its cross platform toolchain is only intended for compiling kernels - not userspace programs!

So I ended up downloading Debian Stretch (currently testing) to compile busybox instead.

The Debian package is called: gcc-aarch64-linux-gnu

sudo apt-get install gcc-aarch64-linux-gnu

and attempt to compile as above.

We also need to ensure that we have the appropriate shared libraries for busybox - usually I'd just use ldd on the executable - however I would need to run an ARM version of ldd to get this working and because I'm feeling lazy I'm going to cheat a little and install the glibc library:

cd /home/limited/workspace
wget http://ftp.gnu.org/gnu/libc/glibc-2.25.tar.bz2
tar xvf glibc*
mkdir buildc && cd buildc
../glibc-2.25/configure --host=aarch64-linux-gnu --build=i686-pc-linux-gnu --prefix= --enable-add-ons
make
make install install_root=/home/limited/workspace/rootfs

We also need to create the directory structure for the rootfs:

mkdir -p proc sys dev etc/init.d usr/lib

We also need to ensure that the /proc and /sys filesystems mount on boot and that the dev nodes are populated:

vi etc/init.d/rcS

and add the following:

#!/bin/sh
mount -t proc none /proc
mount -t sysfs none /sys
echo /sbin/mdev > /proc/sys/kernel/hotplug
/sbin/mdev -s

ensuring it's also executable:

chmod +x etc/init.d/rcS

TODO: Add user / SSH support.

Testing with QEMU

We should have a pretty bare bones filesystem - although we'll spin it up with QEMU firstly to ensure that everything comes up ok:

qemu-system-aarch64 -machine virt -cpu cortex-a57 -nographic -smp 1 -m 512 -kernel Image -initrd rootfs.img -append "console=ttyAMA0 root=/dev/ram rdinit=/sbin/init"

* Note: The last bit (append) is very important - as it instructs the kernel to use the initrd as the root and ensures that the first program to run is /sbin/init. The console, root and rdinit options must all go in a single -append string, since only the last -append given is used. *

Testing on the Raspberry Pi

We'll now move the filesystem over to a new disk, along with the kernel and grub.

Our disk will have a 1GB boot partition formatted with FAT32 and a root partition of 15GB (we will skip swap etc. for this tutorial.)

Install GRUB to the new disk:

sudo grub-install --target=arm64-efi /dev/sdb

Sources:
Build busybox for ARM: http://wiki.beyondlogic.org/index.php?title=Cross_Compiling_BusyBox_for_ARM
Raspberry Pi Foundation: https://www.raspberrypi.org/documentation/linux/kernel/building.md