• My MySQL Cookbook

    Over the years I’ve been accumulating notes on how to do quick operations in a variety of technologies. These have been, up to now, for my eyes only, and they aim to be the documentation I wish I found online on how to quickly get something done in tech X. Today I decided to share some notes of one of those technologies: MySQL.

    Disclaimer: These notes are valid at the moment of writing and not for all versions of MySQL (I am testing 5.7.17 on macOS) as things continuously change.

    Disclaimer 2: You are responsible for your data. Read everything written here with a critical mind and only apply it once you tested it. And I can never emphasise this too much: do backups!

    What you can learn from this post

    • How to install, config and start MySQL server on macOS
    • Quick admin commands
    • Who are you on the MySQL console?
    • How to see running queries and stats
    • How to kill a hanging query
    • How to find slow queries
    • How many queries are running and how many are sleeping connections?
    • How to set the timeout for queries
    • How to set the max connections in MySQL CLI
    • How to set the max connections in the configuration file
    • How to find the maximum concurrent user connections that happened on this server
    • How to configure the Innodb buffer pool
    • How to set the character set in MySQL
    • How to get the top 10 biggest tables in size
    • How to get the oldest tables with respect to update times
    • How to get the data size (in MB) of the different databases
    • How to get all users in MySQL and their permissions
    • How to get the permissions for user MYUSER
    • How to get the permissions for user ‘website’ at IP ‘1.1.1.1’
    • How to give full permission for user ‘myuser’ at localhost on the database ‘magento’
    • How to revoke permissions
    • How to reset the password for user MYUSER
    • How to reset the root password
    • How to set user quotas
    • How to import bulk data faster into MySQL
    • How to export a table to a csv file
    • How to get the table DDL (definitions file)
    • How to do backups
    • How to log a session in MySQL
    • How to check if a table is healthy, fragmented and how to repair it

    First things, first

    How to install, config and start MySQL server on macOS

    1. If you are just trying these out and don’t happen to have a MySQL server to connect to you can download it from here, uncompress the DMG file and click next until it’s installed.
    2. Unfortunately the package does not alias the commands we need for direct use on the terminal so add the following to your .bash_profile:
    echo "alias mysql=/usr/local/mysql/bin/mysql" >> ~/.bash_profile
    echo "alias mysqladmin=/usr/local/mysql/bin/mysqladmin" >> ~/.bash_profile
    echo "alias mysqldumpslow=/usr/local/mysql/bin/mysqldumpslow" >> ~/.bash_profile
    echo "alias mysqldump=/usr/local/mysql/bin/mysqldump" >> ~/.bash_profile
    source ~/.bash_profile
    
    1. The only configuration we are going to make for now is to set a little file called .my.cnf to have our username and password (which was generated during the installation step). This is obviously not safe for production environments but for a playground it saves us typing the username and password every single time. So do the following: vim ~/.my.cnf and add the following to the file:
    [client]
    user=root
    password=yourSecurePassword
    

    For security’s sake please change the permissions on the file so that only you can see it: chmod 0600 ~/.my.cnf.

    1. Now let us start the server. Go to System Preferences and you can see that there is a new icon on the bottom line called MySQL, click there and then on Start MySQL Server (input your password when it asks you)

    2. Go back to the terminal and type mysql and voilá! We’re in!

    3. The password that was generated by the install package is temporary, so I would advise changing it once you’re logged in like this:

      mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('myNewPassword');
      Query OK, 0 rows affected, 1 warning (0.00 sec)
      

    Quick admin commands

    # Show all databases
    show databases;
    
    # See the SQL for the creation of a table
    show create table <table>;
    
    # See all the environment variables
    show global status;
    
    # Set the contents of a global variable
    set global <VARIABLE_NAME>=<VALUE>;
    
    # Get the contents of all variables starting with 'wait'
    show global variables like 'wait%';
    
    # Find all users on this database
    select User, Host from mysql.user;
    

    Who am I?

    Not to get all philosophical on you but if you at one point forget the user you are logged in as do this to find out:

    SELECT USER(),CURRENT_USER();
    

    USER() reports how you attempted to authenticate in MySQL CURRENT_USER() reports how you were allowed to authenticate in MySQL

    On queries

    How to see running queries and stats

    mysqladmin proc stat --verbose
    +----+------+-----------+----+---------+------+----------+-----------------------+
    | Id | User | Host      | db | Command | Time | State    | Info                  |
    +----+------+-----------+----+---------+------+----------+-----------------------+
    | 10 | root | localhost |    | Query   | 0    | starting | show full processlist |
    +----+------+-----------+----+---------+------+----------+-----------------------+
    Uptime: 404  Threads: 1  Questions: 11  Slow queries: 0  Opens: 106  Flush tables: 1  Open tables: 99  Queries per second avg: 0.027
    

    This command allows us to see what’s going on without actually having to log in to the MySQL CLI. You get all the running queries, the uptime of the server, how many of those queries are slow, how many queries are being executed per second, if a query has gone zombie on you and a few others.

    As you can see in the Info bit, one of the command being encapsulated by this one is

    mysql> show full processlist;
    

    This means that you can do the same things in the MySQL CLI.

    How to kill a hanging query

    Imagine that, from the list above, you have one query on your list that is taking forever (and by forever I mean a completely subjective number of seconds in the Time column). Then you can simply kill it like you would a process on your Mac.

    mysql> show full processlist;
    +----+------+-----------+------+---------+------+------------+-----------------------+
    | Id | User | Host      | db   | Command | Time | State      | Info                  |
    +----+------+-----------+------+---------+------+------------+-----------------------+
    | 11 | root | localhost | NULL | Query   |    0 | starting   | show full processlist |
    | 12 | root | localhost | NULL | Query   |   17 | User sleep | select sleep(100)     |
    +----+------+-----------+------+---------+------+------------+-----------------------+
    2 rows in set (0.00 sec)
    mysql> kill 12;
    Query OK, 0 rows affected (0.00 sec)
    

    How to find slow queries

    MySQL not only comes with an opinion on what a “slow query” means, but also comes with a nifty little tool to parse the slow query log files and show you a summary called mysqldumpslow.

    From the MySQL docs:

    The slow query log consists of SQL statements that took more than long_query_time seconds to execute and required at least min_examined_row_limit rows to be examined. The minimum and default values of long_query_time are 0 and 10, respectively.

    And a caveat:

    By default, administrative statements are not logged, nor are queries that do not use indexes for lookups. This behavior can be changed using log_slow_admin_statements and log_queries_not_using_indexes

    By default, slow queries are not logged. So let’s activate them!

    # Activate slow query log
    mysql> set global slow_query_log='ON';
    # max 5 seconds
    mysql> set global long_query_time='5';
    # These values are valid ONLY until the database server is restarted! As an alternative do it in the config file
    # If not already created, create the file
    sudo touch /etc/my.cnf
    # Change owner to yourself
    sudo chown `whoami` /etc/my.cnf
    # Add content
    echo "[mysqld]" >> /etc/my.cnf
    echo "slow_query_log=1" >> /etc/my.cnf
    # You can also activate the following, but beware that the logfile may grow fast
    echo "log_queries_not_using_indexes=1" >> /etc/my.cnf
    echo "log_slow_admin_statements=1" >> /etc/my.cnf
    

    If you want to find out where the log is going to:

    mysql> show global variables like 'slow_query_log_file';
    +---------------------+----------------------------------------------+
    | Variable_name       | Value                                        |
    +---------------------+----------------------------------------------+
    | slow_query_log_file | /usr/local/mysql/data/<YOUR_LAPTOP>-slow.log |
    +---------------------+----------------------------------------------+
    1 row in set (0.00 sec)
    

    So now you can test this!

    mysql> select sleep(11);
    +-----------+
    | sleep(11) |
    +-----------+
    |         0 |
    +-----------+
    1 row in set (11.01 sec)
    

    And then sudo tail -f /usr/local/mysql/data/<YOUR_LAPTOP>-slow.log

    /usr/local/mysql/bin/mysqld, Version: 5.7.17 (MySQL Community Server (GPL)). started with:
    Tcp port: 3306  Unix socket: /tmp/mysql.sock
    Time                 Id Command    Argument
    # Time: 2017-04-09T12:57:07.073262Z
    # [email protected]: root[root] @ localhost []  Id:     5
    # Query_time: 11.005275  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
    SET timestamp=1491742627;
    select sleep(11);
    

    Now you can use mysqldumpslow to show you a nice summary of the slow queries of your database (once you have a few):

    sudo /usr/local/mysql/bin/mysqldumpslow /usr/local/mysql/data/<YOUR_LAPTOP>-slow.log
    
    # Or the top 10 slow queries by count
    sudo /usr/local/mysql/bin/mysqldumpslow -t 10 -s c /usr/local/mysql/data/<YOUR_LAPTOP>-slow.log
    
    # Or the top 10 slow queries by time spent
    sudo /usr/local/mysql/bin/mysqldumpslow -t 10 -s t /usr/local/mysql/data/<YOUR_LAPTOP>-slow.log
    

    How many queries are running and how many are sleeping connections?

    # Running queries
    mysqladmin processlist |wc -l 
    # Sleeping queries
    mysqladmin processlist |grep -i sleep|wc -l 
    

    Configuration

    How to set the timeout for queries

    # What is the default duration that a query can run for?
    mysql> show global variables like 'wait_timeout';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | wait_timeout  | 28800 |
    +---------------+-------+
    1 row in set (0.01 sec)
    # This means 8 hours!
    # Let's set it to 10 minutes
    mysql> set global wait_timeout=600;
    Query OK, 0 rows affected (0.00 sec)
    

    How to set the max connections in MySQL CLI

    mysql> show global variables like 'max_connections';
    +-----------------+-------+
    | Variable_name   | Value |
    +-----------------+-------+
    | max_connections | 151   |
    +-----------------+-------+
    1 row in set (0.00 sec)
    
    mysql> set global max_connections=170;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> show global variables like 'max_connections';
    +-----------------+-------+
    | Variable_name   | Value |
    +-----------------+-------+
    | max_connections | 170   |
    +-----------------+-------+
    1 row in set (0.00 sec)
    
    # This value is valid ONLY until the database server is restarted!
    

    How to set the max connections in the configuration file

    # If you're on a Mac, you have to create the file and add the max_connections to it
    sudo touch /etc/my.cnf
    # Change owner to yourself
    sudo chown `whoami` /etc/my.cnf
    # Add content
    echo "[mysqld]" >> /etc/my.cnf
    echo "max_connections=505" >> /etc/my.cnf
    # Go to System Preferences and restart MySQL
    # And query again
    mysql> show global variables like 'max_connections';
    +-----------------+-------+
    | Variable_name   | Value |
    +-----------------+-------+
    | max_connections | 505   |
    +-----------------+-------+
    1 row in set (0.00 sec)
    

    How to find the maximum concurrent user connections that happened on this server

    mysql> show global status like "%max_used_connections%";
    

    How to configure the Innodb buffer pool

    Ideally you want the buffer pool size to be equal to the size of your innodb data files. If you can’t get to that:

    On a dedicated database server, you might set the buffer pool size to 80% of the machine’s physical memory size.

    You can set the buffer pool size in the my.cnf config file like so:

    [mysqld]
    ...
    innodb_buffer_pool_size = 2G
    ...
    

    Or dynamically (from MySQL v.5.7.5) on the MySQL CLI like so:

    # 2G x 1024 x 1024 x 1024 to get bytes
    mysql> set global innodb_buffer_pool_size=2147483648;
    

    How to set the character set in MySQL

    # Add to [mysqld] section of /etc/my.cnf:
    collation-server = utf8_unicode_ci
    init-connect='SET NAMES utf8'
    

    Getting info

    How to get the top 10 biggest tables in size

    SELECT TABLE_SCHEMA, TABLE_NAME,data_length/1024*1024 FROM `information_schema`.`TABLES` WHERE `TABLE_SCHEMA` !='information_schema' AND `TABLE_SCHEMA` !='mysql' ORDER BY `data_length` DESC LIMIT 10; 
    

    How to get the oldest tables with respect to update times

    SELECT CONCAT (`TABLE_SCHEMA`, "." , `TABLE_NAME`) as name , `UPDATE_TIME` FROM `TABLES` WHERE `UPDATE_TIME` IS NOT NULL ORDER BY `UPDATE_TIME` LIMIT 10; 
    

    How to get the data size (in MB) of the different databases

    SELECT table_schema "Database Name", sum( data_length + index_length ) / 1024 / 1024 "Data Base Size in MB" FROM information_schema.TABLES GROUP BY table_schema ;
    

    Users

    How to get all users in MySQL and their permissions

    SELECT * from `INFORMATION_SCHEMA`.`USER_PRIVILEGES`;
    

    How to get the permissions for user MYUSER

    SELECT * from INFORMATION_SCHEMA.USER_PRIVILEGES where grantee like '%MYUSER%';
    

    How to get the permissions for user ‘website’ at IP ‘1.1.1.1’

    SHOW GRANTS FOR website@'1.1.1.1'; 
    

    How to give full permission for user ‘myuser’ at localhost on the database ‘magento’

    GRANT ALL PRIVILEGES ON magento.* TO 'myuser'@'localhost';
    

    How to revoke permissions

    REVOKE SELECT,INSERT,UPDATE,DELETE ON mydatabase.* FROM 'myuser'@'localhost';
    

    How to reset the password for user MYUSER

    use mysql;
    set password for '<MYUSER>'@'%' = PASSWORD('myPass');
    flush privileges;
    

    How to reset the root password

    # Add this line to the beginning of /etc/my.cnf ([mysqld] section):
    init_file=/var/lib/mysql/mysql.init
    
    # Create file /var/lib/mysql/mysql.init with the following contents:
    SET PASSWORD FOR 'root'@'localhost's=PASSWORD('mynewpassword');
    FLUSH PRIVILEGES;
    
    # Change the owner to the local MySQL user
    chown <MYSQL_USER> /var/lib/mysql/mysql.init
    
    # Restart the server
    # Delete the mysql.init file and remove the init_file line from /etc/my.cnf
    

    How to set user quotas

    This is done through the GRANTS that the user has on the database. You can control MAX_CONNECTIONS_PER_HOUR, MAX_QUERIES_PER_HOUR, MAX_UPDATES_PER_HOUR, MAX_USER_CONNECTIONS.

    One example:

    GRANT ALL ON mydatabase.* TO 'myuser'@'%' WITH MAX_QUERIES_PER_HOUR 100 MAX_UPDATES_PER_HOUR 10;
    

    Import/Export

    How to import bulk data faster into MySQL

    # The trick here is to disable indexes for the duration of the bulk import, so...
    ALTER TABLE table_name DISABLE KEYS;
    # ...import...
    ALTER TABLE table_name ENABLE KEYS;
    

    How to export a table to a csv file

    mysql <MYDATABASE> -e "SELECT * FROM <MYTABLE>" | sed 's/\t/","/g;s/^/"/;s/$/"/;' > <MYTABLE>.csv
    

    How to get the table DDL (definitions file)

    mysqldump -f -d --single-transaction <MYTABLE> > ~/Desktop/createMyTable.sql
    

    How to do backups

    # Backup all databases
    mysqldump --all-databases > ~/Desktop/fullBackup.sql
    
    # Backup just mydb
    mysqldump --databases mydb > ~/Desktop/mydbBackup.sql
    
    # Backup just mytable in mydb
    mysqldump mydb mytable > ~/Desktop/mytableBackup.sql
    

    How to log a session in MySQL

    mysql> tee mysession.log
    Logging to file 'mysession.log'
    

    Others

    How to check if a table is healthy, fragmented and how to repair it

    mysql> check table <MYTABLE>;
    mysql> repair table <MYTABLE>;
    

    I hope this helps!

    If you find mistakes with these recipes, feel free to raise an issue here.

  • How to make Jekyll multilingual

    One of the few requirements that existed for this blog was to be multilingual. This is something that one can easily do with WordPress these days, but then I got turned on to GitHub Pages and one thing led to another and …. many hours later I had a blog that became multilingual, through mercilessly hacking the beautiful Whiteglass theme. Here is a brief explanation of how I made it happen.

    Disclaimer: The code is not pretty. Consider it an MVP. :)

    What you can learn from this post

    • How to turn a Jekyll theme multilingual

    Where I started

    Google of course! My thinking was “There are multiple plugins for this in WordPress, there must be one for Jekyll”. And… I was wrong. For Jekyll, only one language at a time is allowed to exist in the world. But I did come across some very useful blog posts:

    • This one by Sylvain Durand was the most useful and insightful

    • The one from developmentseed was not as detailed but still insightful in using categories for languages

    • Finally these two: by drallgood and by Anthony Gaudino are good alternatives, but I already had a theme, I was limited in terms of plugins because of GitHub Pages and did not want to keep using translations inside the posts themselves

    So I was stuck, and turned back to the author of the theme

    Yous, the person who coded the Whiteglass theme, was super helpful and although he/she said that making the theme multilingual was not in the roadmap, he/she offered some tips on how to do it.

    Here’s how…

    _config.yml

    In _config.yml I changed three things (relevant to the multilingual business)

    lang: en
    languages: ["en", "pt"]
    permalink: /:categories/:year/:month/:day/:title/
    

    Line 1 establishes the general default language of the blog

    Line 2 establishes an array of possible languages

    Line 3 establishes that the URLs should include categories which will be used later

    _data/navigation.yml

    The _data/navigation.yml file was changed to add menus for the different languages as well

    languages:
      - language: "en"
        links:
        - title: "About"
          url: /about/
        - title: "Archives"
          url: /archives/
        - title: "GitHub"
          url: https://github.com/minac
        - title: "pt"
          url: /pt/
      - language: "pt"
        links:
        - title: "Sobre"
          url: /sobre/
        - title: "Arquivos"
          url: /arquivos/
        - title: "GitHub"
          url: https://github.com/minac
        - title: "en"
          url: /
    

    There is no explaining to do, two languages, two subtrees with the correct URLs for either.

    _includes/header.html

    Then I needed to incorporate that into the _includes/header.html

    {% for item in site.data.navigation.languages %}
      {% if item.language == page.lang %}
        {% for link in item.links %}
          {% if link.url contains "http" %}
            {% assign url = link.url %}
          {% else %}
            {% assign url = link.url | relative_url %}
          {% endif %}
          <a class="page-link" href="{{ url }}">{{ link.title }}</a>
        {% endfor %}
      {% endif %}
    {% endfor %}
    

    Only the first 2 lines are important here, a loop and an if clause to show the navigation in the right language.

    _layouts/archive.html

    Then the magic starts happening in the _layouts/archive.html

    This <h1 class="page-heading">Blog Archive</h1> becomes this <h1 class="page-heading">{{ page.title }}</h1>. To avoid hard coding titles.

    This {% for post in site.posts %} becomes this:

    {% assign posts=site.posts | where: "lang", page.lang %}
    {% for post in posts %}
    ...
    {% endfor %}
    

    So that the archives page is language aware.

    And finally this:

    <span class="post-meta">{{ post.date | date: "%b %-d, %Y" }}{% if post.categories != empty %} • {% include category_links.html categories=post.categories %}{% endif %}</span>
    

    Becomes this:

    <span class="post-meta">{{ post.date | date: "%b %-d, %Y" }}{% if post.tags != empty %} • {% for tag in post.tags %}{{ tag }}{% endfor %}{% endif %}</span>
    

    Because we are using categories for the languages, I went with tags for the categorization of the posts. That way each post will have one or more tags but the URL will only reflect the categories themselves.

    _layouts/page.html and _layouts/post.html

    Once the hard work is done, the _layouts/page.html and _layouts/post.html become easy.

    Here’s some meta data added to the page’s post-header:

    <header class="post-header">
      <h1 class="post-title">{{ page.title | escape }}</h1>
      <p class="post-meta"><time datetime="{{ page.date | date_to_xmlschema }}" itemprop="datePublished">{{ page.date | date: "%b %-d, %Y" }}</time> • {% assign pages=site.pages | where: "ref", page.ref | sort: 'lang' %}{% for page in pages %}<a href="{{ page.url }}" class="{{ page.lang }}">{{ page.lang }}</a> {% endfor %}</p>
    </header>
    

    And some meta data added to the post’s post-header:

    <p class="post-meta"><time datetime="{{ page.date | date_to_xmlschema }}" itemprop="datePublished">{{ page.date | date: "%b %-d, %Y" }}</time>{% if page.tags != empty %} • {% for tag in page.tags %}{{ tag }} • {% endfor %}{% endif %} {% assign posts=site.posts | where: "ref", page.ref | sort: 'lang' %}{% for post in posts %}<a href="{{ post.url }}" class="{{ post.lang }}">{{ post.lang }}</a> {% endfor %} • {{ content | number_of_words }} words</p>
    

    index.html

    I decided out of lazyness to use the index.html instead of home which the theme author suggests. Here are the contents:

    <div class="home">
      {% capture site_lang %}{{ site.lang | default: "en" }}{% endcapture %}
      {% assign posts=site.posts | where:"lang", page.lang %}
      <ul class="post-list">
        {% for post in posts %}
          {% capture lang %}{% if post.lang != site_lang %}{{ post.lang }}{% else %}{{ site_lang }}{% endif %}{% endcapture %}
          <li{% if lang != empty %} lang="{{ lang }}"{% endif %}>
            <header class="post-header">
              <h1 class="post-title">
                <a class="post-link" href="{{ post.url | relative_url }}">{{ post.title | escape }}{% if post.external-url %} &rarr;{% endif %}</a>
              </h1>
              <p class="post-meta">{{ post.date | date: "%b %-d, %Y" }}{% if post.tags != empty %} • {% for tag in post.tags %} {{ tag }} • {% endfor %}{% endif %}</p>
            </header>
            <div class="post-content">
              {{ post.excerpt }}
            </div>
            {% if post.content contains site.excerpt_separator %}
              <p class="post-continue">
                <a href="{{ post.url | relative_url }}">Read on &rarr;</a>
              </p>
            {% endif %}
          </li>
        {% endfor %}
      </ul>
      {% include pagination.html %}
    </div>
    

    Line 2, 3 and 6 are the only ones that matter for this purpose.

    ‘about’ becomes ‘sobre’, ‘archives’ become ‘arquivos’

    After all this was done I needed to create Portuguese pages for the English equivalents. So about.md got a sister sobre.md and archives.md got arquivos.md. The contents are not relevant.

    File tree

    So in the end this is the tree of files of the project:

    ./_config.yml
    ./_data
    ./_data/navigation.yml
    ./_includes
    ./_includes/fonts.html
    ./_includes/footer.html
    ./_includes/footer_content.html
    ./_includes/google_analytics.html
    ./_includes/head.html
    ./_includes/head_custom.html
    ./_includes/header.html
    ./_includes/pagination.html
    ./_layouts
    ./_layouts/archive.html
    ./_layouts/category_archives.html
    ./_layouts/default.html
    ./_layouts/feed.xml
    ./_layouts/page.html
    ./_layouts/post.html
    ./_sass
    ./_sass/whiteglass
    ./sass/whiteglass/base.scss
    ./sass/whiteglass/layout.scss
    ./sass/whiteglass/syntax-highlighting.scss
    ./_sass/whiteglass.scss
    ./about.md
    ./sobre.md
    ./archives.md
    ./arquivos.md
    ./assets
    ./assets/main.scss
    ./feed.xml
    ./Gemfile
    ./index.html
    ./en
    ./en/_posts
    ./en/_posts/2017-03-04-new-blog-new-life.md
    ./pt
    ./pt/_posts
    ./pt/_posts/2017-03-04-new-blog-new-life.md
    ./pt.html
    

    Most of the files were already part of the theme. The ones that I would like to draw your attention to are on lines 26/27, 28/29 and the English directory 35-37 and the Portuguese directory 38-40. pt.html is the unimaginatively named Portuguese index.html.

    So now that all of that is done, how does one create a new post in Portuguese and English (or either one)?

    Front matter for new posts

    Well, see ./en/_posts/2017-03-04-new-blog-new-life.md and ./pt/_posts/2017-03-04-new-blog-new-life.md? Here are the respective Front matters:

    ---
    layout: post
    title: "New blog, new life"
    tags: [blog]
    author: "Miguel David"
    date: 2017-03-04 16:12:07 +0000
    lang: en
    ref: new-blog-new-life
    ---
    

    And for the Portuguese one:

    ---
    layout: post
    title: "Novo blog, nova vida"
    tags: [blog]
    author: "Miguel David"
    date: 2017-03-04 16:12:07 +0000
    lang: pt
    ref: new-blog-new-life
    ---
    

    What distinguises them (besides the contents after the front matter) is their localized title and lang. Everything else is the same, including the reference which is used to change between both languages when looking at the specific post!

    I hope this helps!

  • Podcasts I listen to

    I was asked the other day by a friend going on a long trip which podcasts I listen to. I typically listen to them while running, but they’re all safe for driving as well. Disclaimer: The list is entirely personal and subjective.

    What you can learn from this post

    • Podcasts which I find interesting and actively listen to in different categories

    Entrepeneurial

    Economics

    Tech

    Inspiring

    I hope you enjoy!

  • An ansible role for Anchor CMS

    What you can learn from this post

    • How to install Anchor CMS
    • How to write an ansible role ready for Ansible Galaxy

    What we want to do

    In broad strokes, this is what we want to do:

    1. Update your system apt-get update && apt-get upgrade

    2. Install the required packages apt-get install unzip vim apache2 libapache2-mod-php5 mysql-server php5-curl php5-mcrypt php5-gd php5-mysql

    3. Secure your mysql install

    4. Edit /root/.my.cnf in case you want to get into Mysql from the console without typing a password vim /root/.my.cnf

    [client]
    user = root
    password = <YourPassword>
    
    1. Create the database To get in: mysql To create the database: mysql> create database anchor; quit

    2. Check your firewall to allow for ports SSH and HTTP outbound ufw status or iptables -nvL

    3. Get anchor cd /var/www/html; wget http://anchorcms.com/download -O anchor.zip; unzip anchor.zip

    4. Adjust permissions chown -R www-data:www-data anchor-cms

    5. Setup apache to allow for .htaccess sed -i 's/AllowOverride None/AllowOverride All/g' /etc/apache2/apache2.conf

    6. Enable apache modules a2enmod rewrite

    7. Add the .htaccess to the base directory ( /var/www/html/anchor-cms/.htaccess )

    Options -indexes
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    
    # Allow any files or directories that exist to be displayed directly
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    
    # Rewrite all other URLs to index.php/URL
    RewriteRule ^(.*)$ index.php/$1 [L]
    </IfModule>
    
    <IfModule !mod_rewrite.c>
    ErrorDocument 404 index.php
    </IfModule>
    
    1. Restart services service apache2 restart; service mysql-server restart

    2. Go to http:/// (or http://localhost:8080 in case of Vagrant) to start the installation.

    Now with Ansible

    So I spent an afternoon and got it done using Ansible! This is version 1 so I’m sure it can be vastly improved. Oh! And it has a Vagrantfile in case you want to try it out locally with vagrant up.

    You can see the code here or the Galaxy role here!

    There is one step that I could not automate because it has to be done after the configuration and installation process of Anchor itself which you can do manually:

    rm -rf /var/www/html/anchor-cms/install

    Enjoy!

  • New blog, new life

    Welcome! This blog has had several iterations over the years. Most of its life was in a self hosted WordPress, a bit in Anchor, a bit in Ghost and finally in Medium. As the continuous experiment that it is, now it’s a mix of Jekyll and Github Pages fronted by the amazing Cloudflare. It’s fun and easy to blog again and I don’t have to pay anyone to host me, so expect a bunch of small posts coming up!

  • How to deny hot linking from your AWS S3 files

    There are many good people in the web, but there are also those who are lazy and instead of creating content themselves or even hosting it, they will want to abuse the existing stuff online. For example, say you have a bucket (originally) called mybucket and a file called … myfile.jpg. You uploaded the file to AWS S3 and now you are serving it from there through a webpage that we’ll call test.html.
    Here’s what the URL of the file could look like:
    https://s3.eu-central-1.amazonaws.com/mybucket/myfile.jpg

    And your test page could be like:

    <html><body><br> Hello<br> <img
    src=”https://s3.eu-central-1.amazonaws.com/mybucket/myfile.jpg" alt=”” /><br>
    </body></html>
    

    Now Mr. Evil comes along and likes your myfile.jpg but does not want to pay for the bandwidth for when someone goes into his website to see your myfile.jpg. How does he do it? Simply:

    <html><body><br> Mr. Evil’s webpage<br> <img
    src=”https://s3.eu-central-1.amazonaws.com/mybucket/myfile.jpg" alt=”” /><br>
    </body></html>
    

    Every time someone visits his page, not only are they seeing your file but the bandwidth charges are coming to you because they are in effect accessing the file in your AWS account!

    How to prevent this? Using a bucket policy. Go to S3 in your account, then click on your bucket, then click on Permissions and Edit bucket policy. You will see an empty text box. Copy paste the code below and adjust to your bucket name and your website URLs:

    {
    	"Version": "2012-10-17",
    	"Statement": [
    		{
    			"Sid": "AddPerm",
    			"Effect": "Allow",
    			"Principal": "*",
    			"Action": "s3:GetObject",
    			"Resource": "arn:aws:s3:::mybucket/*"
    		},
    		{
    			"Sid": "Deny requests for hot-linking.",
    			"Effect": "Deny",
    			"Principal": "*",
    			"Action": "s3:GetObject",
    			"Resource": "arn:aws:s3:::mybucket/*",
    			"Condition": {
    				"StringNotLike": {
    					"aws:Referer": [
    						"http://*.mywebsite.com/*",
    						"https://*.mywebsite.com/*"
    					]
    				}
    			}
    		}
    	]
    }
    

    NOTE: If you have Cloudfront in front of your S3 bucket, then other rules apply. I’ll cover that in another post.

    What is happening here is that in the first statement we are allowing everyone to get our files in this bucket (required for web access to the world), but then in the second statement we are saying that S3 should only allow the objects to be available if the referral is a variation of mywebsite.com (in this case with and without www and https).

    What happens now? When anyone visits your test page they will be able to see the file, but when they go to Mr. Evil’s page they will see a broken link where the image should be. :)

    Take that Mr. Evil!

  • S3/Cloudfront Access Denied

    So you finally listened the SEO guys who keep saying that the images on your website have to be fast and you decided to move them from your small cloud server to an AWS S3 bucket? You managed to create the bucket and even enabled Cloudfront on that bucket so that the images (or other files) are not only in one region of AWS but everywhere in the world (well, every edge node on AWSs Content Delivery Network). This will make sure that someone in Japan will see your site as fast as someone in Oregon, USA. You then uploaded the files and changed your website code to point, not to your local server files but to the cloudfront URL (or S3 bucket URL), but when the moment of truth comes… The images are not showing up and when you go directly to the link of the image you get a nasty XML saying “Access Denied”.

    The good news is that you are not the first person to go through this so the process of adding the permissions is straigh forward:

    1. Go to AWS S3 and click on your bucket
    2. Click on properties
    3. Click on Edit bucket policy
    4. Copy the following into the text box, replace examplebucket with your bucket name and you’re done

      { “Version”:”2012-10-17”, “Statement”:[ { “Sid”:”AddPerm”, “Effect”:”Allow”, “Principal”: “”, “Action”:[“s3:GetObject”], “Resource”:[“arn:aws:s3:::examplebucket/”] } ] }

    What are we doing here exactly? We are allowing everyone in the world to be able to get objects from your S3 bucket. Don’t worry, no-one will have access to creating/deleting/changing your files, only to see all the files in that bucket (which should be the bucket with your publicly available web files anyway).

  • Today I had some fun with cows saying fortunes

    Today I had some fun with cows saying fortunes. If you are not in the Linux world, it will seem like I’ve been smoking some strange Dalmatian herbs, but if you have played at all with Linux you’ll know this happy face:

    _________________________________________
    / You will be reincarnated as a toad; and \
    \ you will be much happier.               /
     -----------------------------------------
           \ (__)
             (oo)
       /------\/
      / |    ||
     *  /\---/\
        ~~   ~~
    

    Now, who said geeks don’t know how to have fun?

    To make it less boring I made a little script to output short fortunes with random animals when I log on to my servers and here is the result. Go play with it, have a laugh every time you log into a server of yours. ;)

    I leave you with this pearl of wisdom:

    ________________________________
    / When in doubt, tell the truth. \
    |                                |
    \ -- Mark Twain                  /
     --------------------------------
     \
      \
       \ >()_
          (__)__ _
    
  • nova-agent not starting with ‘import site’ failed

    It’s amazing how long it’s been since I last posted here! I’ve moved to a new role more on the Infrastructure side of things and have been busy.

    Today I wanted to share how to correct a bug that nova-agent version 0.0.1.38 has.

    Problem

    When starting it you would see this

    # service nova-agent restart
    'import site' failed; use -v for traceback
    

    Unhelpful, right?

    Investigation

    Looking at the log shows that the problem comes from the xscomm.py script:

    # tail /var/log/nova-agent.log
    2014-05-06 17:57:17,299 [ERROR] [EXC] InvalidArgError: Invalid argument specified
    2014-05-06 17:57:17,299 [ERROR] failed to parse config file '/usr/share/nova-agent/nova-agent.py'
    2014-05-06 17:59:50,892 [ERROR] Failed to run python code: A python exception has occurred:
    2014-05-06 17:59:50,892 [ERROR] [EXC] Traceback (most recent call last):
    2014-05-06 17:59:50,892 [ERROR] [EXC]   File "/usr/share/nova-agent/nova-agent.py", line 41, in <module>
    2014-05-06 17:59:50,892 [ERROR] [EXC]     xs = plugins.XSComm()
    2014-05-06 17:59:50,892 [ERROR] [EXC]   File "xscomm.py", line 44, in __init__
    2014-05-06 17:59:50,892 [ERROR] [EXC] InvalidArgError: Invalid argument specified
    2014-05-06 17:59:50,893 [ERROR] failed to parse config file '/usr/share/nova-agent/nova-agent.py'
    

    A quick find will show that that script is a compiled python script here:

    # ls -lah /usr/share/nova-agent/0.0.1.38/plugins
    total 28K
    drwx------ 2 root root 4.0K May  6 18:05 .
    drwx------ 7 root root 4.0K May  6 18:01 ..
    -rwx------ 1 root root  253 Nov  7 18:17 __init__.pyc
    -rwx------ 1 root root 3.1K Nov  7 18:17 jsonparser.pyc
    -rw-r--r-- 1 root root 3.5K May  6 18:05 xscomm.pyc
    

    Hmm… compiled eh?

    Solution

    Ok, let’s go to the source on github. Once there you can see that xscomm.py has been updated 3 months ago because of some startup failures. Looks promising, so let’s go ahead and get it:

    # cd /usr/share/nova-agent/0.0.1.38/plugins
    
    # wget 
    --2014-05-06 18:04:41--  
    Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.31.19.133
    Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.31.19.133|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 5709 (5.6K) [text/plain]
    Saving to: `xscomm.py'
    
    100%[==============================================================>] 5,709       --.-K/s   in 0s
    
    2014-05-06 18:04:41 (103 MB/s) - `xscomm.py' saved [5709/5709]
    
    Move the old one out of the way:
    
    # mv xscomm.pyc /tmp
    
    Aannnd start!
    
    # /etc/init.d/nova-agent restart && tail -1 /var/log/nova-agent.log
    
    2014-05-06 18:05:09,903 [INFO] Agent 0.0.1.38 started
    
    I hope this helps.