Home News Feeds Planet MySQL
Planet MySQL
Planet MySQL - https://planet.mysql.com

  • Several Ways to Intentionally Fail or Crash your MySQL Instances for Testing
    You can take down a MySQL database in multiple ways. Some obvious ways are to shut down the host, pull out the power cable, or hard kill the mysqld process with SIGKILL to simulate an unclean MySQL shutdown behaviour. But there are also less subtle ways to deliberately crash your MySQL server, and then see what kind of chain reaction it triggers. Why would you want to do this? Failure and recovery can have many corner cases, and understanding them can help reduce the element of surprise when things happen in production. Ideally, you would want to simulate failures in a controlled environment, and then design and test database failover procedures. There are several areas in MySQL that we can tackle, depending on how you want it to fail or crash. You can corrupt the tablespace, overflow the MySQL buffer and caches, limit the resources to starve the server, and also mess around with permissions. In this blog post, we are going to show you some examples of how to crash a MySQL server in a Linux environment. Some of them would be suitable for e.g. Amazon RDS instances, where you would have no access to the underlying host. Kill, Kill, Kill, Die, Die, Die The easiest way to fail a MySQL server is to simply kill the process or host, and not give MySQL a chance to do a graceful shutdown. To simulate a mysqld crash, just send signal 4, 6, 7, 8, 11 to the process: $ kill -11 $(pidof mysqld) When looking at the MySQL error log, you can see the following lines: 11:06:09 UTC - mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. Attempting to collect some information that could help diagnose the problem. As this is a crash and something is definitely wrong, the information collection process might fail. .. Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... You can also use kill -9 (SIGKILL) to kill the process immediately. More details on Linux signal can be found here. Alternatively, you can use a meaner way on the hardware side like pulling off the power cable, pressing down the hard reset button or using a fencing device to STONITH. Triggering OOM Popular MySQL in the cloud offerings like Amazon RDS and Google Cloud SQL have no straightforward way to crash them. Firstly because you won't get any OS-level access to the database instance, and secondly because the provider uses a proprietary patched MySQL server. One ways is to overflow some buffers, and let the out-of-memory (OOM) manager to kick out the MySQL process. You can increase the sort buffer size to something bigger than what the RAM can handle, and shoot a number of mysql sort queries against the MySQL server. Let's create a 10 million rows table using sysbench on our Amazon RDS instance, so we can build a huge sort: $ sysbench \ --db-driver=mysql \ --oltp-table-size=10000000 \ --oltp-tables-count=1 \ --threads=1 \ --mysql-host=dbtest.cdw9q2wnb00s.ap-tokyo-1.rds.amazonaws.com \ --mysql-port=3306 \ --mysql-user=rdsroot \ --mysql-password=password \ /usr/share/sysbench/tests/include/oltp_legacy/parallel_prepare.lua \ run Change the sort_buffer_size to 5G (our test instance is db.t2.micro - 1GB, 1vCPU) by going to Amazon RDS Dashboard -> Parameter Groups -> Create Parameter Group -> specify the group name -> Edit Parameters -> choose "sort_buffer_size" and specify the value as 5368709120. Apply the parameter group changes by going to Instances -> Instance Action -> Modify -> Database Options -> Database Parameter Group -> and choose our newly created parameter group. Then, reboot the RDS instance to apply the changes. Once up, verify the new value of sort_buffer_size: MySQL [(none)]> select @@sort_buffer_size; +--------------------+ | @@sort_buffer_size | +--------------------+ | 5368709120 | +--------------------+ Then fire 48 simple queries that requires sorting from a client: $ for i in {1..48}; do (mysql -urdsroot -ppassword -hdbtest.cdw9q2wnb00s.ap-tokyo-1.rds.amazonaws.com -e 'SELECT * FROM sbtest.sbtest1 ORDER BY c DESC >/dev/null &); done If you run the above on a standard host, you will notice the MySQL server will be terminated and you can see the following lines appear in the OS's syslog or dmesg: [164199.868060] Out of memory: Kill process 47060 (mysqld) score 847 or sacrifice child [164199.868109] Killed process 47060 (mysqld) total-vm:265264964kB, anon-rss:3257400kB, file-rss:0kB With systemd, MySQL or MariaDB will be restarted automatically, so does Amazon RDS. You can see the uptime for our RDS instance will be resetted back to 0 (under mysqladmin status), and the 'Latest restore time' value (under RDS Dashboard) will be updated to the moment it went down. Corrupting the Data InnoDB has its own system tablespace to store data dictionary, buffers and rollback segments inside a file named ibdata1. It also stores the shared tablespace if you do not configure innodb_file_per_table (enabled by default in MySQL 5.6.6+). We can just zero this file, send a write operation and flush tables to crash mysqld: # empty ibdata1 $ cat /dev/null > /var/lib/mysql/ibdata1 # send a write $ mysql -uroot -p -e 'CREATE TABLE sbtest.test (id INT)' # flush tables $ mysql -uroot -p -e 'FLUSH TABLES WITH READ LOCK; UNLOCK TABLES' After you send a write, in the error log, you will notice: 2017-11-15T06:01:59.345316Z 0 [ERROR] InnoDB: Tried to read 16384 bytes at offset 98304, but was only able to read 0 2017-11-15T06:01:59.345332Z 0 [ERROR] InnoDB: File (unknown): 'read' returned OS error 0. Cannot continue operation 2017-11-15T06:01:59.345343Z 0 [ERROR] InnoDB: Cannot continue operation. At this point, mysql will hang because it cannot perform any operation, and after the flushing, you will get "mysqld got signal 11" lines and mysqld will shut down. To clean up, you have to remove the corrupted ibdata1, as well as ib_logfile* because the redo log files cannot be used with a new system tablespace that will be generated by mysqld on the next restart. Data loss is expected. For MyISAM tables, we can mess around with .MYD (MyISAM data file) and .MYI (MyISAM index) under the MySQL datadir. For instance, the following command replaces any occurrence of string "F" with "9" inside a file: $ replace F 9 -- /var/lib/mysql/sbtest/sbtest1.MYD Then, send some writes (e.g, using sysbench) to the target table and perform the flushing: mysql> FLUSH TABLE sbtest.sbtest1; The following should appear in the MySQL error log: 2017-11-15T06:56:15.021564Z 448 [ERROR] /usr/sbin/mysqld: Incorrect key file for table './sbtest/sbtest1.MYI'; try to repair it 2017-11-15T06:56:15.021572Z 448 [ERROR] Got an error from thread_id=448, /export/home/pb2/build/sb_0-24964902-1505318733.42/rpm/BUILD/mysql-5.7.20/mysql-5.7.20/storage/myisam/mi_update.c:227 The MyISAM table will marked as crashed and runing REPAIR TABLE statement is necessary to make it accessible again. Limiting the Resources We can also apply the operating system resource limit to our mysqld process, for example number of open file descriptors. Using open_file_limit variable (default is 5000) allows mysqld to reserve file descriptors using setrlimit() command. You can set this variable relatively small (just enough for mysqld to start up) and then send multiple queries to the MySQL server until it hits the limit. If mysqld is running in a systemd server, we can set it in the systemd unit file located at /usr/lib/systemd/system/mysqld.service, and change the following value to something lower (systemd default is 6000): # Sets open_files_limit LimitNOFILE = 30 Apply the changes to systemd and restart MySQL server: $ systemctl daemon-reload $ systemctl restart mysqld Then, start sending new connections/queries that count in different databases and tables so mysqld has to open multiple files. You will notice the following error: 2017-11-16T04:43:26.179295Z 4 [ERROR] InnoDB: Operating system error number 24 in a file operation. 2017-11-16T04:43:26.179342Z 4 [ERROR] InnoDB: Error number 24 means 'Too many open files' 2017-11-16T04:43:26.179354Z 4 [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html 2017-11-16T04:43:26.179363Z 4 [ERROR] InnoDB: File ./sbtest/sbtest9.ibd: 'open' returned OS error 124. Cannot continue operation 2017-11-16T04:43:26.179371Z 4 [ERROR] InnoDB: Cannot continue operation. 2017-11-16T04:43:26.372605Z 0 [Note] InnoDB: FTS optimize thread exiting. 2017-11-16T04:45:06.816056Z 4 [Warning] InnoDB: 3 threads created by InnoDB had not exited at shutdown! At this point, when the limit is reached, MySQL will freeze and it will not be able to perform any operation. When trying to connect, you would see the following after a while: $ mysql -uroot -p ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 104 Messing up with Permissions The mysqld process runs by "mysql" user, which means all the files and directory that it needs to access are owned by mysql user/group. By messing up with the permission and ownership, we can make the MySQL server useless: $ chown root:root /var/lib/mysql $ chmod 600 /var/lib/mysql Generate some loads to the server and then connect to the MySQL server and flush all tables onto disk: mysql> FLUSH TABLES WITH READ LOCK; UNLOCK TABLES; At this moment, mysqld is still running but it's kind of useless. You can access it via a mysql client but you can't do any operation: mysql> SHOW DATABASES; ERROR 1018 (HY000): Can't read dir of '.' (errno: 13 - Permission denied) To clean up the mess, set the correct permissions: $ chown mysql:mysql /var/lib/mysql $ chmod 750 /var/lib/mysql $ systemctl restart mysqld Lock it Down FLUSH TABLE WITH READ LOCK (FTWRL) can be destructive in a number of conditions. Like for example, in a Galera cluster where all nodes are able to process writes, you can use this statement to lock down the cluster from within one of the nodes. This statement simply halts other queries to be processed by mysqld during the flushing until the lock is released, which is very handy for backup processes (MyISAM tables) and file system snapshots. Although this action won't crash or bring down your database server during the locking, the consequence can be huge if the session that holds the lock does not release it. To try this, simply: mysql> FLUSH TABLES WITH READ LOCK; mysql> exit Related resources  Architecting for Failure - Disaster Recovery of MySQL/MariaDB Galera Cluster  Become a MySQL DBA blog series - Understanding the MySQL Error Log  Become a MySQL DBA blog series - Database High Availability Then send a bunch of new queries to the mysqld until it reaches the max_connections value. Obviously, you can not get back the same session as the previous one once you are out. So the lock will be running infinitely and the only way to release the lock is by killing the query, by another SUPER privilege user (using another session). Or kill the mysqld process itself, or perform a hard reboot. Disclaimer This blog is written to give alternatives to sysadmins and DBAs to simulate failure scenarios with MySQL. Do not try these on your production server :-) Tags:  galera MySQL crash failure recovery rds testing

  • This Week in Data with Colin Charles 15: Percona Live 2018 Call for Papers and Best Practices for Observability
    Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community. So we have announced the call for presentations for Percona Live Santa Clara 2018. Please send your submissions in! As you probably already know, we have been expanding the content to be more than just MySQL and MongoDB. It really does include more open source databases: the whole of 2016 had a “time series” theme to it, and we of course love to have more PostgreSQL content (there have been tracks dedicated to PostgreSQL for sometime now). I found this one comment interesting recently, from John Arundel, “If you’re going to learn one database really well, make it Postgres.” I have been noticing newer developers jump on the PostgreSQL bandwagon. I presume much of this blog’s readership is still MySQL centric, but it will be interesting to see where this goes. Charity Majors recently wrote Best Practices for Observability. In addition, her book alongside Laine Campbell is now available for purchase on Kindle: Database Reliability Engineering: Designing and Operating Resilient Database Systems. Highly recommended purchase. You can also get it on O’Reilly Safari (free month with those codes for Percona Live Europe Dublin attendees). Are you using Google Cloud Spanner? It now has multi-region support, and has an updated SLA for 99.999% uptime. That’s basically no more than 5.25 minutes of downtime per year! Releases orchestrator 3.0.3 – auto-provisioning Raft nodes, native Consul support, SQLite or MySQL backed setups, web UI improvements and more. Solid release. MongoDB 3.6 – you can download this soon. MariaDB 10.1.29 – important changes to Mariabackup, InnoDB/XtraDB, and some security fixes Apache Kylin 2.2 – OLAP for Hadoop, originally developed at eBay, has enhanced ACL support amongst other improvements. Cassandra on Azure Cosmos DB Link List What even is streaming data and Kafka? What is a message bus? Everything you ever wanted to know about modern event-driven architectures but were afraid to ask. – recommended read, checkout the videos attached as well. Microsoft is rocking it when it comes to open source – checkout SQL Operations Studio, where there is already an issue to encourage support for PostgreSQL, MySQL, and SQLite. Microsoft joins the MariaDB Foundation as a platinum sponsor and also offers this on Azure. This wasn’t without confusion, related to the Ars Technica piece. I enjoy reading the comments, including the one on Hacker News. Gabriela Ferrara has a great slide deck on MySQL 8.0 Preview: What is Coming. Matt Aslett published Database market set for seismic shift driven by accelerated cloud and SaaS adoption, and if for some reason you can’t view it, the tweeted graph is very useful. Understanding caching in Postgres – An in-depth guide Upcoming Appearances ACMUG 2017 gathering – Beijing, China, December 9-10 2017 – it was very exciting being there in 2016, I can only imagine it’s going to be be bigger and better for 2017, since it is now two days long! Feedback I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

  • SQLyog Thanksgiving Sale has begun
    Happy Thanksgiving! We know everyone likes spending quality time with their family at the holidays. That’s why we thought you might like to start using SQLyog MySQL GUI as you try to accomplish more work in less time. Get flat 30% off on SQLyog purchases and renewals. Use coupon code: TG30 Upgrade your SQLyog community edition to the commercial version to save tons of time on a daily basis. Purchase here. If you’re an existing customer and want to extend your license – head right away to Customer Area. Hurry up, this offer is valid until 23rd November, 23:59 PST. Cheers,Team SQLyog The post SQLyog Thanksgiving Sale has begun appeared first on SQLyog Blog.

  • Optimization to skip index dives with FORCE INDEX.
    Index dives are performed during the optimization phase to help decide which index to use. Currently when user specifies FORCE INDEX, optimizer still always calculates cost using index dives. Under some circumstances it is possible to avoid the index dives and this could speed up execution.…

  • A new optimization in gcc 5.x and MySQL
    We were testing the impact of changing from gcc 4.9 to 5.x and from older to newer jemalloc. For one of the in-memory sysbench tests the QPS at high-concurrency dropped by 10%. The test was read-only with range-size set to 1000. If the test was limited to the order-ranges query then the QPS dropped by ~25%. This wasn't good.I repeated the test with newer jemalloc and gcc 4.9 and there was no loss of QPS. So now it looked like a problem with gcc 5.x and not jemalloc. I then did a build with tcmalloc, but at startup mysqld would get an illegal free error from tcmalloc. After finding what I think is a RocksDB bug and then another problem, I found a workaround and soon discovered that for tcmalloc and glibc malloc there was also a decrease in QPS for gcc 5.x but not 4.9. Now I was worried that I might lose the debugging expertise of the internal jemalloc team at work, but fortunately they found the problem while collaborating with the MyRocks team.AFAIK this isn't a MyRocks-only problem because it comes from the allocation done to sort for the order by clause. But I am tired of running tests and won't test it for InnoDB. Good news for the rest of the world. This is an issue for MySQL 5.6 but probably not for 5.7 and 8.x.The problem is a gcc5 optimization (see gcc issues 67618 and 83022) that transforms the call sequence malloc, memset into a call to calloc. This appears to be done even for 2mb allocations (my.cnf had sort_buffer_size=2m). The output from perf wasn't clear about the problem this creates. For jemalloc it reported a new CPU overhead from smp_call_function_interrupt calling flush_tlb_func and all of that is kernel code. I was told that was from jemalloc zero-ing pages. A workaround that doesn't require a code change is to compile with -fno-builtin-malloc. There are workarounds that require code changes that I won't list here.Here are performance results from mysqld compiled with gcc 4.9 vs 5.x and linked with different allocators (jemalloc, tcmalloc, glibc malloc). The test is in-memory sysbench read-only with range-size=1000, 8 tables and 1M rows/table. The test uses 48 concurrent connections and the server has 48 HW threads (24 cores, HT enabled).QPS     configuration78622   gcc4.9, jemalloc76787   gcc4.9, tcmalloc73340   gcc4.9, glibc malloc-65673   gcc5.x, jemalloc58958   gcc5.x, tcmalloc48750   gcc5.x, glibc malloc-78028   gcc5.x, jemalloc, -fno-builtin-malloc78135   gcc5.x, tcmalloc, -fno-builtin-malloc78207   gcc5.x, glibc malloc, -fno-builtin-malloc

Copyright © 2017 andhrabay.com. All Rights Reserved.
Joomla! is Free Software released under the GNU/GPL License.