MySQL Interview Questions and Answers

What's MySQL ?
MySQL (pronounced "my ess cue el") is an open source relational database management system (RDBMS) that uses Structured Query Language (SQL), the most popular language for adding, accessing, and processing data in a database. Because it is open source, anyone can download MySQL and tailor it to their needs in accordance with the general public license. MySQL is noted mainly for its speed, reliability, and flexibility. ...

What is DDL, DML and DCL ?
If you look at the large variety of SQL commands, they can be divided into three large subgroups. Data Definition Language deals with database schemas and descriptions of how the data should reside in the database, therefore language statements like CREATE TABLE or ALTER TABLE belong to DDL. DML deals with data manipulation, and therefore includes most common SQL statements such SELECT, INSERT, etc. Data Control Language includes commands such as GRANT, and mostly concerns with rights, permissions and other controls of the database system.

How do you get the number of rows affected by query?
SELECT COUNT (user_id) FROM users would only return the number of user_id’s.

If the value in the column is repeatable, how do you find out the unique values?
Use DISTINCT in the query, such as SELECT DISTINCT user_firstname FROM users; You can also ask for a number of distinct values by saying SELECT COUNT (DISTINCT user_firstname) FROM users;

How do you return the a hundred books starting from 25th?
SELECT book_title FROM books LIMIT 25, 100. The first number in LIMIT is the offset, the second is the number.

You wrote a search engine that should retrieve 10 results at a time, but at the same time you’d like to know how many rows there’re total. How do you display that to the user?
SELECT SQL_CALC_FOUND_ROWS page_title FROM web_pages LIMIT 1,10; SELECT FOUND_ROWS(); The second query (not that COUNT() is never used) will tell you how many results there’re total, so you can display a phrase "Found 13,450,600 results, displaying 1-10". Note that FOUND_ROWS does not pay attention to the LIMITs you specified and always returns the total number of rows affected by query.

How would you write a query to select all teams that won either 2, 4, 6 or 8 games?
SELECT team_name FROM teams WHERE team_won IN (2, 4, 6, 8)

How would you select all the users, whose phone number is null?
SELECT user_name FROM users WHERE ISNULL(user_phonenumber);

What does this query mean: SELECT user_name, user_isp FROM users LEFT JOIN isps USING (user_id) ?
It’s equivalent to saying SELECT user_name, user_isp FROM users LEFT JOIN isps WHERE users.user_id=isps.user_id

How do you find out which auto increment was assigned on the last insert?

SELECT LAST_INSERT_ID() will return the last value assigned by the auto_increment function. Note that you don’t have to specify the table name.

What does –i-am-a-dummy flag to do when starting MySQL?
Makes the MySQL engine refuse UPDATE and DELETE commands where the WHERE clause is not present.

On executing the DELETE statement I keep getting the error about foreign key constraint failing. What do I do?
What it means is that so of the data that you’re trying to delete is still alive in another table. Like if you have a table for universities and a table for students, which contains the ID of the university they go to, running a delete on a university table will fail if the students table still contains people enrolled at that university. Proper way to do it would be to delete the offending data first, and then delete the university in question. Quick way would involve running SET foreign_key_checks=0 before the DELETE command, and setting the parameter back to 1 after the DELETE is done. If your foreign key was formulated with ON DELETE CASCADE, the data in dependent tables will be removed automatically.

When would you use ORDER BY in DELETE statement?
When you’re not deleting by row ID. Such as in DELETE FROM techpreparation_com_questions ORDER BY timestamp LIMIT 1. This will delete the most recently posted question in the table techpreparation_com_questions.

How can you see all indexes defined for a table?
SHOW INDEX FROM techpreparation_questions;

How would you change a column from VARCHAR(10) to VARCHAR(50)?
ALTER TABLE techpreparation_questions CHANGE techpreparation_content techpreparation_CONTENT VARCHAR(50).

How would you delete a column?
ALTER TABLE techpreparation_answers DROP answer_user_id.

How would you change a table to InnoDB?
ALTER TABLE techpreparation_questions ENGINE innodb;

When you create a table, and then run SHOW CREATE TABLE on it, you occasionally get different results than what you typed in. What does MySQL modify in your newly created tables?
1. VARCHARs with length less than 4 become CHARs
2. CHARs with length more than 3 become VARCHARs.
3. NOT NULL gets added to the columns declared as PRIMARY KEYs
4. Default values such as NULL are specified for each column

How do I find out all databases starting with ‘tech’ to which I have access to?
SHOW DATABASES LIKE ‘tech%’;

How do you concatenate strings in MySQL?
CONCAT (string1, string2, string3)

How do you get a portion of a string?
SELECT SUBSTR(title, 1, 10) from techpreparation_questions;

What’s the difference between CHAR_LENGTH and LENGTH?
The first is, naturally, the character count. The second is byte count. For the Latin characters the numbers are the same, but they’re not the same for Unicode and other encodings.

How do you convert a string to UTF-8?

SELECT (techpreparation_question USING utf8);

What do % and _ mean inside LIKE statement?
% corresponds to 0 or more characters, _ is exactly one character.

What does + mean in REGEXP?
At least one character. Appendix G. Regular Expressions from MySQL manual is worth perusing before the interview.

How do you get the month from a timestamp?
SELECT MONTH(techpreparation_timestamp) from techpreparation_questions;

How do you offload the time/date handling to MySQL?
SELECT DATE_FORMAT(techpreparation_timestamp, ‘%Y-%m-%d’) from techpreparation_questions; A similar TIME_FORMAT function deals with time.

How do you add three minutes to a date?
ADDDATE(techpreparation_publication_date, INTERVAL 3 MINUTE)

What’s the difference between Unix timestamps and MySQL timestamps?
Internally Unix timestamps are stored as 32-bit integers, while MySQL timestamps are stored in a similar manner, but represented in readable YYYY-MM-DD HH:MM:SS format.

How do you convert between Unix timestamps and MySQL timestamps?
UNIX_TIMESTAMP converts from MySQL timestamp to Unix timestamp, FROM_UNIXTIME converts from Unix timestamp to MySQL timestamp.

What are ENUMs used for in MySQL?
You can limit the possible values that go into the table. CREATE TABLE months (month ENUM ‘January’, ‘February’, ‘March’,…); INSERT months VALUES (’April’);

How are ENUMs and SETs represented internally?
As unique integers representing the powers of two, due to storage optimizations.

How do you start and stop MySQL on Windows?
net start MySQL, net stop MySQL

How do you start MySQL on Linux?
/etc/init.d/mysql start

Explain the difference between mysql and mysql interfaces in PHP?
mysqli is the object-oriented version of mysql library functions.

What’s the default port for MySQL Server?
3306

What does tee command do in MySQL?
tee followed by a filename turns on MySQL logging to a specified file. It can be stopped by command note.

Can you save your connection settings to a conf file?
Yes, and name it ~/.my.conf. You might want to change the permissions on the file to 600, so that it’s not readable by others.

How do you change a password for an existing user via mysqladmin?
mysqladmin -u root -p password "newpassword"

Use mysqldump to create a copy of the database?
mysqldump -h mysqlhost -u username -p mydatabasename > dbdump.sql

Have you ever used MySQL Administrator and MySQL Query Browser?
Describe the tasks you accomplished with these tools.

What are some good ideas regarding user security in MySQL?
There is no user without a password. There is no user without a user name. There is no user whose Host column contains % (which here indicates that the user can log in from anywhere in the network or the Internet). There are as few users as possible (in the ideal case only root) who have unrestricted access.

Explain the difference between MyISAM Static and MyISAM Dynamic. ?
In MyISAM static all the fields have fixed width. The Dynamic MyISAM table would include fields such as TEXT, BLOB, etc. to accommodate the data types with various lengths. MyISAM Static would be easier to restore in case of corruption, since even though you might lose some data, you know exactly where to look for the beginning of the next record.

What does myisamchk do?
It compressed the MyISAM tables, which reduces their disk usage.

Explain advantages of InnoDB over MyISAM?
Row-level locking, transactions, foreign key constraints and crash recovery.

Explain advantages of MyISAM over InnoDB?
Much more conservative approach to disk space management - each MyISAM table is stored in a separate file, which could be compressed then with myisamchk if needed. With InnoDB the tables are stored in tablespace, and not much further optimization is possible. All data except for TEXT and BLOB can occupy 8,000 bytes at most. No full text indexing is available for InnoDB. TRhe COUNT(*)s execute slower than in MyISAM due to tablespace complexity.

General Information About MySQL
MySQL is a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server.
MySQL is free software. It is licensed with the GNU GENERAL PUBLIC LICENSE http://www.gnu.org/.

What Is MySQL
MySQL, the most popular Open Source SQL database, is provided by MySQL AB. MySQL AB is a commercial company that builds is business providing services around the MySQL database. See section 1.2 What Is MySQL AB.

ySQL is a database management system.
A database is a structured collection of data. It may be anything from a simple shopping list to a picture gallery or the vast amounts of information in a corporate network. To add, access, and process data stored in a computer database, you need a database management system such as MySQL. Since computers are very good at handling large amounts of data, database management plays a central role in computing, as stand-alone utilities, or as parts of other applications.

MySQL is a relational database management system.
A relational database stores data in separate tables rather than putting all the data in one big storeroom. This adds speed and flexibility. The tables are linked by defined relations making it possible to combine data from several tables on request. The SQL part of MySQL stands for "Structured Query Language" - the most common standardized language used to access databases.

MySQL is Open Source Software.
Open source means that it is possible for anyone to use and modify. Anybody can download MySQL from the Internet and use it without paying anything. Anybody so inclined can study the source code and change it to fit their needs. MySQL uses the GPL (GNU General Public License) http://www.gnu.org, to define what you may and may not do with the software in different situations. If you feel uncomfortable with the GPL or need to embed MySQL into a commercial application you can buy a commercially licensed version from us.

Why use MySQL?
MySQL is very fast, reliable, and easy to use. If that is what you are looking for, you should give it a try. MySQL also has a very practical set of features developed in very close cooperation with our users. You can find a performance comparison of MySQL to some other database managers on our benchmark page. See section 12.7 Using Your Own Benchmarks. MySQL was originally developed to handle very large databases much faster than existing solutions and has been successfully used in highly demanding production environments for several years. Though under constant development, MySQL today offers a rich and very useful set of functions. The connectivity, speed, and security make MySQL highly suited for accessing databases on the Internet.

The technical features of MySQL
For advanced technical information, see section 7 MySQL Language Reference. MySQL is a client/server system that consists of a multi-threaded SQL server that supports different backends, several different client programs and libraries, administrative tools, and a programming interface. We also provide MySQL as a multi-threaded library which you can link into your application to get a smaller, faster, easier to manage product. MySQL has a lot of contributed software available.
It is very likely that you will find that your favorite application/language already supports MySQL. The official way to pronounce MySQL is ``My Ess Que Ell'' (not MY-SEQUEL). But we try to avoid correcting people who say MY-SEQUEL.

he following list describes some of the important characteristics of MySQL:
Fully multi-threaded using kernel threads. That means it can easily use multiple CPUs if available.
C, C++, Eiffel, Java, Perl, PHP, Python and Tcl APIs.
Works on many different platforms.
Many column types: signed/unsigned integers 1, 2, 3, 4, and 8 bytes long, FLOAT, DOUBLE, CHAR, VARCHAR, TEXT, BLOB, DATE, TIME, DATETIME, TIMESTAMP, YEAR, SET, and ENUM types.
Very fast joins using an optimized one-sweep multi-join.
Full operator and function support in the SELECT and WHERE parts of queries. Example:
mysql> SELECT CONCAT(first_name, " ", last_name) FROM tbl_name
WHERE income/dependents > 10000 AND age > 30;
SQL functions are implemented through a highly optimized class library and should be as fast as they can get! Usually there shouldn't be any memory allocation at all after query initialization.
Full support for SQL GROUP BY and ORDER BY clauses. Support for group functions (COUNT(), COUNT(DISTINCT), AVG(), STD(), SUM(), MAX() and MIN()).
Support for LEFT OUTER JOIN and RIGHT OUTER JOIN with ANSI SQL and ODBC syntax.
You can mix tables from different databases in the same query (as of Version 3.22).
A privilege and password system that is very flexible and secure and allows host-based verification. Passwords are secure because all password traffic is encrypted when you connect to a server.
ODBC (Open-DataBase-Connectivity) support for Win32 (with source). All ODBC 2.5 functions and many others. For example, you can use MS Access to connect to your MySQL server. See section 18 MySQL ODBC Support.
Very fast B-tree disk tables with index compression.
Up to 32 indexes per table are allowed. Each index may consist of 1 to 16 columns or parts of columns. The maximum index length is 500 bytes (this may be changed when compiling MySQL). An index may use a prefix of a CHAR or VARCHAR field. Fixed-length and variable-length records.
In-memory hash tables which are used as temporary tables.
Handles large databases. We are using MySQL with some databases that contain 50,000,000 records and we know of users that uses MySQL with 60,000 tables and about 5,000,000,000 rows
All columns have default values. You can use INSERT to insert a subset of a table's columns; those columns that are not explicitly given values are set to their default values.
Uses GNU Automake, Autoconf, and libtool for portability.
Written in C and C++. Tested with a broad range of different compilers.
A very fast thread-based memory allocation system.
No memory leaks. Tested with a commercial memory leakage detector (purify).
Includes myisamchk, a very fast utility for table checking, optimization, and repair. See section 15 Maintaining a MySQL Installation.
Full support for several different character sets, including ISO-8859-1 (Latin1), big5, ujis, and more. For example, the Scandinavian characters `@ringaccent{a}', `@"a' and `@"o' are allowed in table and column names.
All data are saved in the chosen character set. All comparisons for normal string columns are case insensitive.
Sorting is done according to the chosen character set (the Swedish way by default). It is possible to change this when the MySQL server is started up. To see an example of very advanced sorting, look at the Czech sorting code. MySQL supports many different character sets that can be specified at compile and run time.
Aliases on tables and columns are allowed as in the SQL92 standard.
DELETE, INSERT, REPLACE, and UPDATE return how many rows were changed (affected). It is possible to return the number of rows matched instead by setting a flag when connecting to the server.
Function names do not clash with table or column names. For example, ABS is a valid column name. The only restriction is that for a function call, no spaces are allowed between the function name and the `(' that follows it. See section 7.39 Is MySQL Picky About Reserved Words?.
All MySQL programs can be invoked with the --help or -? options to obtain online assistance.
The server can provide error messages to clients in many languages. See section 10.1 What Languages Are Supported by MySQL?. Clients may connect to the MySQL server using TCP/IP Sockets, Unix Sockets (Unixes), or Named Pipes (NT). The MySQL-specific SHOW command can be used to retrieve information about databases, tables, and indexes. The EXPLAIN command can be used to determine how the optimizer resolves a query

Database Basics
Databases are managed by a relational database management system (RDBMS). An RDBMS supports a database language to create and delete databases and to manage and search data. The database language used in almost all DBMSs is SQL, a set of statements that define and manipulate data. After creating a database, the most common SQL statements used are INSERT, UPDATE, DELETE, and SELECT, which add, change, remove, and search data in a database, respectively.

Database
A repository to store data.

Table
The part of a database that stores the data. A table has columns or attributes, and the data stored in rows.

Attributes
The columns in a table. All rows in table entities have the same attributes. For example, a customer table might have the attributes name, address, and city. Each attribute has a data type such as string, integer, or date.

Rows
The data entries in a table. Rows contain values for each attribute. For example, a row in a customer table might contain the values "Matthew Richardson," "Punt Road," and "Richmond." Rows are also known as records.

Relational model
A model that uses tables to store data and manage the relationship between tables.

Relational database management system
A software system that manages data in a database and is based on the relational model. DBMSs have several components described in detail in Chapter 1.

SQL
A query language that interacts with a DBMS. SQL is a set of statements to manage databases, tables, and data.

Constraints
Restrictions or limitations on tables and attributes. For example, a wine can be produced only by one winery, an order for wine can't exist if it isn't associated with a customer, having a name attribute could be mandatory for a customer.

Primary key
One or more attributes that contain values that uniquely identify each row. For example, a customer table might have the primary key of cust ID. The cust ID attribute is then assigned a unique value for each customer. A primary key is a constraint of most tables.

Index
A data structure used for fast access to rows in a table. An index is usually built for the primary key of each table and can then be used to quickly find a particular row. Indexes are also defined and built for other attributes when those attributes are frequently used in queries.

Entity-relationship modeling
A technique used to describe the real-world data in terms of entities, attributes, and relationships.

Normalized database
A correctly designed database that is created from an ER model. There are different types or levels of normalization, and a third-normal form database is generally regarded as being an acceptably designed relational database.

MySQL Command Interpreter
The MySQL command interpreter is commonly used to create databases and tables in web database applications and to test queries. Throughout the remainder of this chapter we discuss the SQL statements for managing a database. All these statements can be directly entered into the command interpreter and executed. The statements can also be included in server-side PHP scripts, as discussed in later chapters.
Once the MySQL DBMS server is running, the command interpreter can be used. The command interpreter can be run using the following command from the shell, assuming you've created a user hugh with a password shhh:
% /usr/local/bin/mysql -uhugh -pshhh The shell prompt is represented here as a percentage character, %.
Running the command interpreter displays the output:
Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 36 to server version: 3.22.38
Type 'help' for help.
mysql>
The command interpreter displays a mysql> prompt and, after executing any command or statement, it redisplays the prompt. For example, you might issue the statement:
mysql> SELECT NOW( );
This statement reports the time and date by producing the following output:
+---------------------+
| NOW( ) |
+---------------------+
| 2002-01-01 13:48:07 |
+---------------------+
1 row in set (0.00 sec)
mysql>
After running a statement, the interpreter redisplays the mysql> prompt. We discuss the SELECT statement later in this chapter.
As with all other SQL statements, the SELECT statement ends in a semicolon. Almost all SQL command interpreters permit any amount of whitespace—spaces, tabs, or carriage returns—in SQL statements, and they check syntax and execute statements only after encountering a semicolon that is followed by a press of the Enter key. We have used uppercase for the SQL statements throughout this book. However, any mix of upper- and lowercase is equivalent.
On startup, the command interpreter encourages the use of the help command. Typing help produces a list of commands that are native to the MySQL interpreter and that aren't part of SQL. All non-SQL commands can be entered without the terminating semicolon, but the semicolon can be included without causing an error.
The MySQL command interpreter allows flexible entry of commands and SQL statements:
The up and down arrow keys allow previously entered commands and statements to be browsed and used.
The interpreter has command completion. If you type the first few characters of a string that has previously been entered and press the Tab key, the interpreter automatically completes the command. For example, if wines is typed and the Tab key pressed, the command interpreter outputs winestore, assuming the word winestore has been previously used.
If there's more than one option that begins with the characters entered, or you wish the strings that match the characters to be displayed, press the Tab key twice to show all matches. You can then enter additional characters to remove any ambiguity and press the Tab key again for command completion.
Several common statements and commands are pre-stored, including most of the SQL keywords discussed in this chapter.
To use the default text editor to create SQL statements, enter the command edit in the interpreter. This invokes the editor defined by the EDITOR shell environment variable. When the editor is exited, the MySQL command interpreter reads, parses, and runs the file created in the editor.
When the interpreter is quit and run again later, the history of commands and statements is kept. It is still possible to scroll up using the up arrow and to execute commands and statements that were entered earlier.
You can run commands and SQL statements without actually launching the MySQL command interpreter. For example, to run SELECT now( ) from the Linux shell, enter the following command:
mysql -ppassword -e "SELECT now( );" This is particularly useful for adding SQL commands to shell or other scripts.

Installing a MySQL Binary Distribution
You need the following tools to install a MySQL binary distribution:
GNU gunzip to uncompress the distribution.
A reasonable tar to unpack the distribution. GNU tar is known to work. Sun tar is known to have problems.
An alternative installation method under Linux is to use RPM (RedHat Package Manager) distributions.
If you run into problems, PLEASE ALWAYS USE mysqlbug when posting questions to mysql@lists.mysql.com. Even if the problem isn't a bug, mysqlbug gathers system information that will help others solve your problem. By not using mysqlbug, you lessen the likelihood of getting a solution to your problem! You will find mysqlbug in the `bin' directory after you unpack the distribution.
The basic commands you must execute to install and use a MySQL binary distribution are:
shell> groupadd mysql
shell> useradd -g mysql mysql
shell> cd /usr/local
shell> gunzip < /path/to/mysql-VERSION-OS.tar.gz | tar xvf -
shell> ln -s mysql-VERSION-OS mysql
shell> cd mysql
shell> scripts/mysql_install_db
shell> chown -R mysql /usr/local/mysql
shell> chgrp -R mysql /usr/local/mysql
shell> bin/safe_mysqld --user=mysql &
You can add new users using the bin/mysql_setpermission script if you install the DBI and Msql-Mysql-modules Perl modules.
A more detailed description follows.
Pick the directory under which you want to unpack the distribution, and move into it. In the example below, we unpack the distribution under `/usr/local' and create a directory `/usr/local/mysql' into which MySQL is installed. (The following instructions therefore assume you have permission to create files in `/usr/local'. If that directory is protected, you will need to perform the installation as root.)
How to Get MySQL. MySQL binary distributions are provided as compressed tar archives and have names like `mysql-VERSION-OS.tar.gz', where VERSION is a number (for example, 3.21.15), and OS indicates the type of operating system for which the distribution is intended (for example, pc-linux-gnu-i586). Add a user and group for mysqld to run as:
shell> groupadd mysql
shell> useradd -g mysql mysql
These commands add the mysql group and the mysql user. The syntax for useradd and groupadd may differ slightly on different Unixes. They may also be called adduser and addgroup. You may wish to call the user and group something else instead of mysql.
Change into the intended installation directory:
shell> cd /usr/local>
Unpack the distribution and create the installation directory:
shell> gunzip < /path/to/mysql-VERSION-OS.tar.gz | tar xvf -
shell> ln -s mysql-VERSION-OS mysql
The first command creates a directory named `mysql-VERSION-OS'. The second command makes a symbolic link to that directory. This lets you refer more easily to the installation directory as `/usr/local/mysql'.
Change into the installation directory:
shell> cd mysql
You will find several files and subdirectories in the mysql directory. The most important for installation purposes are the `bin' and `scripts' subdirectories.
`bin'
This directory contains client programs and the server You should add the full pathname of this directory to your PATH environment variable so that your shell finds the MySQL programs properly.
`scripts'
This directory contains the mysql_install_db script used to initialize the server access permissions.
If you would like to use mysqlaccess and have the MySQL distribution in some nonstandard place, you must change the location where mysqlaccess expects to find the mysql client. Edit the `bin/mysqlaccess' script at approximately line 18. Search for a line that looks like this:
$MYSQL = '/usr/local/bin/mysql'; # path to mysql executable
Change the path to reflect the location where mysql actually is stored on your system. If you do not do this, you will get a Broken pipe error when you run mysqlaccess.
Create the MySQL grant tables (necessary only if you haven't installed MySQL before):
shell> scripts/mysql_install_db
Note that MySQL versions older than Version 3.22.10 started the MySQL server when you run mysql_install_db. This is no longer true! Change ownership of the installation directory to the user that you will run mysqld as:
shell> chown -R mysql /usr/local/mysql
shell> chgrp -R mysql /usr/local/mysql
The first command changes the owner attribute of the files to the mysql user, and the second changes the group attribute to the mysql group.
If you would like MySQL to start automatically when you boot your machine, you can copy support-files/mysql.server to the location where your system has its startup files. More information can be found in the support-files/mysql.server script itself.
After everything has been unpacked and installed, you should initialize and test your distribution.
You can start the MySQL server with the following command:
shell> bin/safe_mysqld --user=mysql &

MySQL - Quick Installation Overview
The basic commands you must execute to install a MySQL source distribution are:
shell> groupadd mysql
shell> useradd -g mysql mysql
shell> gunzip < mysql-VERSION.tar.gz | tar -xvf -
shell> cd mysql-VERSION
shell> ./configure --prefix=/usr/local/mysql
shell> make
shell> make install
shell> scripts/mysql_install_db
shell> chown -R mysql /usr/local/mysql
shell> chgrp -R mysql /usr/local/mysql
shell> /usr/local/mysql/bin/safe_mysqld --user=mysql &
If you start from a source RPM, then do the following:
shell> rpm --rebuild MySQL-VERSION.src.rpm
This will make a binary RPM that you can install.
You can add new users using the bin/mysql_setpermission script if you install the DBI and Msql-Mysql-modules Perl modules.
A more detailed description follows.
Pick the directory under which you want to unpack the distribution, and move into it.
If you are interested in using Berkeley DB tables with MySQL, you will need to obtain a patched version of the Berkeley DB source code. Please read the chapter on Berkeley DB tables before proceeding.
MySQL source distributions are provided as compressed tar archives and have names like `mysql-VERSION.tar.gz', where VERSION is a number like 3.23.33.
Add a user and group for mysqld to run as:
shell> groupadd mysql
shell> useradd -g mysql mysql
These commands add the mysql group, and the mysql user. The syntax for useradd and groupadd may differ slightly on different Unixes. They may also be called adduser and addgroup. You may wish to call the user and group something else instead of mysql.
Unpack the distribution into the current directory:
shell> gunzip < /path/to/mysql-VERSION.tar.gz | tar xvf -
This command creates a directory named `mysql-VERSION'.
Change into the top-level directory of the unpacked distribution:
shell> cd mysql-VERSION
Note that currently you must configure and build MySQL from this top-level directory. You can not build it in a different directory.
Configure the release and compile everything:
shell> ./configure --prefix=/usr/local/mysql
shell> make
When you run configure, you might want to specify some options. Run ./configure --help for a list of options. If configure fails, and you are going to send mail to
mysql@lists.mysql.com to ask for assistance, please include any lines from `config.log' that you think can help solve the problem. Also include the last couple of lines of output from configure if configure aborts. Post the bug report using the mysqlbug script.
Install everything:
shell> make install
You might need to run this command as root.
Create the MySQL grant tables (necessary only if you haven't installed MySQL before):
shell> scripts/mysql_install_db
Note that MySQL versions older than Version 3.22.10 started the MySQL server when you run mysql_install_db. This is no longer true!
Change ownership of the installation to the user that you will run mysqld as:
shell> chown -R mysql /usr/local/mysql
shell> chgrp -R mysql /usr/local/mysql
The first command changes the owner attribute of the files to the mysql user, and the second changes the group attribute to the mysql group.
If you would like MySQL to start automatically when you boot your machine, you can copy support-files/mysql.server to the location where your system has its startup files. More information can be found in the support-files/mysql.server script itself.
After everything has been installed, you should initialize and test your distribution:
shell> /usr/local/mysql/bin/safe_mysqld --user=mysql &
If that command fails immediately with mysqld daemon ended then you can find some information in the file `mysql-data-directory/'hostname'.err'. The likely reason is that you already have another mysqld server running.

MySQL - SELECT INTO TABLE
MySQL doesn't yet support the Oracle SQL extension: SELECT ... INTO TABLE .... MySQL supports instead the ANSI SQL syntax INSERT INTO ... SELECT ..., which is basically the same thing.
Alternatively, you can use SELECT INTO OUTFILE... or CREATE TABLE ... SELECT to solve your problem.

MySQL - Transactions
As MySQL does nowadays support transactions, the following discussion is only valid if you are only using the non-transaction-safe table types.
The question is often asked, by the curious and the critical, ``Why is MySQL not a transactional database?'' or ``Why does MySQL not support transactions?''
MySQL has made a conscious decision to support another paradigm for data integrity, ``atomic operations.'' It is our thinking and experience that atomic operations offer equal or even better integrity with much better performance. We, nonetheless, appreciate and understand the transactional database paradigm and plan, within the next few releases, to introduce transaction-safe tables on a per table basis. We will be giving our users the possibility to decide if they need the speed of atomic operations or if they need to use transactional features in their applications.
How does one use the features of MySQL to maintain rigorous integrity and how do these features compare with the transactional paradigm?
First, in the transactional paradigm, if your applications are written in a way that is dependent on the calling of ``rollback'' instead of ``commit'' in critical situations, then transactions are more convenient. Moreover, transactions ensure that unfinished updates or corrupting activities are not committed to the database; the server is given the opportunity to do an automatic rollback and your database is saved.
MySQL, in almost all cases, allows you to solve for potential problems by including simple checks before updates and by running simple scripts that check the databases for inconsistencies and automatically repair or warn if such occurs. Note that just by using the MySQL log or even adding one extra log, one can normally fix tables perfectly with no data integrity loss.
Moreover, fatal transactional updates can be rewritten to be atomic. In fact,we will go so far as to say that all integrity problems that transactions solve can be done with LOCK TABLES or atomic updates, ensuring that you never will get an automatic abort from the database, which is a common problem with transactional databases.
Not even transactions can prevent all loss if the server goes down. In such cases even a transactional system can lose data. The difference between different systems lies in just how small the time-lap is where they could lose data. No system is 100% secure, only ``secure enough.'' Even Oracle, reputed to be the safest of transactional databases, is reported to sometimes lose data in such situations.
To be safe with MySQL, you only need to have backups and have the update logging turned on. With this you can recover from any situation that you could with any transactional database. It is, of course, always good to have backups, independent of which database you use.
The transactional paradigm has its benefits and its drawbacks. Many users and application developers depend on the ease with which they can code around problems where an abort appears to be, or is necessary, and they may have to do a little more work with MySQL to either think differently or write more. If you are new to the atomic operations paradigm, or more familiar or more comfortable with transactions, do not jump to the conclusion that MySQL has not addressed these issues. Reliability and integrity are foremost in our minds. Recent estimates indicate that there are more than 1,000,000 mysqld servers currently running, many of which are in production environments. We hear very, very seldom from our users that they have lost any data, and in almost all of those cases user error is involved. This is, in our opinion, the best proof of MySQL's stability and reliability.
Lastly, in situations where integrity is of highest importance, MySQL's current features allow for transaction-level or better reliability and integrity. If you lock tables with LOCK TABLES, all updates will stall until any integrity checks are made. If you only obtain a read lock (as opposed to a write lock), then reads and inserts are still allowed to happen. The new inserted records will not be seen by any of the clients that have a READ lock until they release their read locks. With INSERT DELAYED you can queue inserts into a local queue, until the locks are released, without having the client wait for the insert to complete.
``Atomic,'' in the sense that we mean it, is nothing magical. It only means that you can be sure that while each specific update is running, no other user can interfere with it, and there will never be an automatic rollback (which can happen on transaction based systems if you are not very careful). MySQL also guarantees that there will not be any dirty reads. You can find some example of how to write atomic updates in the commit-rollback section.
We have thought quite a bit about integrity and performance, and we believe that our atomic operations paradigm allows for both high reliability and extremely high performance, on the order of three to five times the speed of the fastest and most optimally tuned of transactional databases. We didn't leave out transactions because they are hard to do. The main reason we went with atomic operations as opposed to transactions is that by doing this we could apply many speed optimizations that would not otherwise have been possible.
Many of our users who have speed foremost in their minds are not at all concerned about transactions. For them transactions are not an issue. For those of our users who are concerned with or have wondered about transactions vis-a-vis MySQL, there is a ``MySQL way'' as we have outlined above. For those where safety is more important than speed, we recommend them to use the BDB tables for all their critical data.
One final note: We are currently working on a safe replication schema that we believe to be better than any commercial replication system we know of. This system will work most reliably under the atomic operations, non-transactional, paradigm. Stay tuned.

MySQL - Stored Procedures and Triggers
A stored procedure is a set of SQL commands that can be compiled and stored in the server. Once this has been done, clients don't need to keep reissuing the entire query but can refer to the stored procedure. This provides better performance because the query has to be parsed only once, and less information needs to be sent between the server and the client. You can also raise the conceptual level by having libraries of functions in the server.
A trigger is a stored procedure that is invoked when a particular event occurs. For example, you can install a stored procedure that is triggered each time a record is deleted from a transaction table and that automatically deletes the corresponding customer from a customer table when all his transactions are deleted.
The planned update language will be able to handle stored procedures, but without triggers. Triggers usually slow down everything, even queries for which they are not needed.

MySQL - Foreign Keys
Note that foreign keys in SQL are not used to join tables, but are used mostly for checking referential integrity (foreign key constraints). If you want to get results from multiple tables from a SELECT statement, you do this by joining tables:
SELECT * from table1,table2 where table1.id = table2.id;
The FOREIGN KEY syntax in MySQL exists only for compatibility with other SQL vendors' CREATE TABLE commands; it doesn't do anything. The FOREIGN KEY syntax without ON DELETE ... is mostly used for documentation purposes. Some ODBC applications may use this to produce automatic WHERE clauses, but this is usually easy to override. FOREIGN KEY is sometimes used as a constraint check, but this check is unnecessary in practice if rows are inserted into the tables in the right order. MySQL only supports these clauses because some applications require them to exist (regardless of whether or not they work).
In MySQL, you can work around the problem of ON DELETE ... not being implemented by adding the appropriate DELETE statement to an application when you delete records from a table that has a foreign key. In practice this is as quick (in some cases quicker) and much more portable than using foreign keys.
In the near future we will extend the FOREIGN KEY implementation so that at least the information will be saved in the table specification file and may be retrieved by mysqldump and ODBC. At a later stage we will implement the foreign key constraints for application that can't easily be coded to avoid them.

MySQL - Reasons NOT to Use Foreign Keys constraints
There are so many problems with foreign key constraints that we don't know where to start:
Foreign key constraints make life very complicated, because the foreign key definitions must be stored in a database and implementing them would destroy the whole ``nice approach'' of using files that can be moved, copied, and removed. The speed impact is terrible for INSERT and UPDATE statements, and in this case almost all FOREIGN KEY constraint checks are useless because you usually insert records in the right tables in the right order, anyway. There is also a need to hold locks on many more tables when updating one table, because the side effects can cascade through the entire database. It's MUCH faster to delete records from one table first and subsequently delete them from the other tables.
You can no longer restore a table by doing a full delete from the table and then restoring all records (from a new source or from a backup).
If you use foreign key constraints you can't dump and restore tables unless you do so in a very specific order. It's very easy to do ``allowed'' circular definitions that make the tables impossible to re-create each table with a single create statement, even if the definition works and is usable.
It's very easy to overlook FOREIGN KEY ... ON DELETE rules when one codes an application. It's not unusual that one loses a lot of important information just because a wrong or misused ON DELETE rule.
The only nice aspect of FOREIGN KEY is that it gives ODBC and some other client programs the ability to see how a table is connected and to use this to show connection diagrams and to help in building applicatons.
MySQL will soon store FOREIGN KEY definitions so that a client can ask for and receive an answer about how the original connection was made. The current `.frm' file format does not have any place for it. At a later stage we will implement the foreign key constraints for application that can't easily be coded to avoid them.

MySQL - `--' as the Start of a Comment
MySQL doesn't support views, but this is on the TODO.

MySQL - Views
Some other SQL databases use `--' to start comments. MySQL has `#' as the start comment character, even if the mysql command-line tool removes all lines that start with `--'. You can also use the C comment style /* this is a comment */ with MySQL.
MySQL Version 3.23.3 and above supports the `--' comment style only if the comment is followed by a space. This is because this degenerate comment style has caused many problems with automatically generated SQL queries that have used something like the following code, where we automatically insert the value of the payment for !payment!:
UPDATE tbl_name SET credit=credit-!payment!
What do you think will happen when the value of payment is negative?
Because 1--1 is legal in SQL, we think it is terrible that `--' means start comment.
In MySQL Version 3.23 you can, however, use: 1-- This is a comment
The following discussion only concerns you if you are running a MySQL version earlier than Version 3.23:
If you have a SQL program in a text file that contains `--' comments you should use:
shell> replace " --" " #" < text-file-with-funny-comments.sql \
| mysql database
instead of the usual:
shell> mysql database < text-file-with-funny-comments.sql
You can also edit the command file ``in place'' to change the `--' comments to `#' comments:
shell> replace " --" " #" -- text-file-with-funny-comments.sql
Change them back with this command:
shell> replace " #" " --" -- text-file-with-funny-comments.sql

MySQL - How to Cope Without COMMIT/ROLLBACK
The following mostly applies only for ISAM, MyISAM, and HEAP tables. If you only use transaction-safe tables (BDB tables) in an a update, you can do COMMIT and ROLLBACK also with MySQL.
The problem with handling COMMIT-ROLLBACK efficiently with the above table types would require a completely different table layout than MySQL uses today. The table type would also need extra threads that do automatic cleanups on the tables, and the disk usage would be much higher. This would make these table types about 2-4 times slower than they are today.
For the moment, we prefer implementing the SQL server language (something like stored procedures). With this you would very seldom really need COMMIT-ROLLBACK. This would also give much better performance.
Loops that need transactions normally can be coded with the help of LOCK TABLES, and you don't need cursors when you can update records on the fly.
We at TcX had a greater need for a real fast database than a 100% general database. Whenever we find a way to implement these features without any speed loss, we will probably do it. For the moment, there are many more important things to do. Check the TODO for how we prioritize things at the moment. (Customers with higher levels of support can alter this, so things may be reprioritized.)
The current problem is actually ROLLBACK. Without ROLLBACK, you can do any kind of COMMIT action with LOCK TABLES. To support ROLLBACK with the above table types, MySQL would have to be changed to store all old records that were updated and revert everything back to the starting point if ROLLBACK was issued. For simple cases, this isn't that hard to do (the current isamlog could be used for this purpose), but it would be much more difficult to implement ROLLBACK for ALTER/DROP/CREATE TABLE.
To avoid using ROLLBACK, you can use the following strategy:
Use LOCK TABLES ... to lock all the tables you want to access.
Test conditions.
Update if everything is okay.
Use UNLOCK TABLES to release your locks.
This is usually a much faster method than using transactions with possible ROLLBACKs, although not always. The only situation this solution doesn't handle is when someone kills the threads in the middle of an update. In this case, all locks will be released but some of the updates may not have been executed.
You can also use functions to update records in a single operation. You can get a very efficient application by using the following techniques:
Modify fields relative to their current value.
Update only those fields that actually have changed.
For example, when we are doing updates to some customer information, we update only the customer data that has changed and test only that none of the changed data, or data that depend on the changed data, has changed compared to the original row. The test for changed data is done with the WHERE clause in the UPDATE statement. If the record wasn't updated, we give the client a message: "Some of the data you have changed have been changed by another user". Then we show the old row versus the new row in a window, so the user can decide which version of the customer record he should use.
This gives us something that is similar to column locking but is actually even better, because we only update some of the columns, using values that are relative to their current values. This means that typical UPDATE statements look something like these:
UPDATE tablename SET pay_back=pay_back+'relative change';
UPDATE customer
SET
customer_date='current_date',
address='new address',
phone='new phone',
money_he_owes_us=money_he_owes_us+'new_money'
WHERE
customer_id=id AND address='old address' AND phone='old phone';
As you can see, this is very efficient and works even if another client has changed the values in the pay_back or money_he_owes_us columns.
In many cases, users have wanted ROLLBACK and/or LOCK TABLES for the purpose of managing unique identifiers for some tables. This can be handled much more efficiently by using an AUTO_INCREMENT column and either the SQL function LAST_INSERT_ID() or the C API function mysql_insert_id().
At MySQL AB, we have never had any need for row-level locking because we have always been able to code around it. Some cases really need row locking, but they are very few. If you want row-level locking, you can use a flag column in the table and do something like this:
UPDATE tbl_name SET row_flag=1 WHERE id=ID;
MySQL returns 1 for the number of affected rows if the row was found and row_flag wasn't already 1 in the original row.
You can think of it as MySQL changed the above query to:
UPDATE tbl_name SET row_flag=1 WHERE id=ID and row_flag <> 1;

MySQL - General Security
Anyone using MySQL on a computer connected to the Internet should read this section to avoid the most common security mistakes.
In discussing security, we emphasize the necessity of fully protecting the entire server host (not simply the MySQL server) against all types of applicable attacks: eavesdropping, altering, playback, and denial of service. We do not cover all aspects of availability and fault tolerance here.
MySQL uses Access Control Lists (ACLs) security for all connections, queries, and other operations that a user may attempt to perform. There is also some support for SSL-encrypted connections between MySQL clients and servers. Many of the concepts discussed here are not specific to MySQL at all; the same general ideas apply to almost all applications.
When running MySQL, follow these guidelines whenever possible:
DON'T EVER GIVE ANYONE (EXCEPT THE MySQL ROOT USER) ACCESS TO THE mysql.user TABLE! The encrypted password is the real password in MySQL. If you know this for one user you can easily login as him if you have access to his 'host'.
Learn the MySQL access privilege system. The GRANT and REVOKE commands are used for restricting access to MySQL. Do not grant any more privileges than necessary. Never grant privileges to all hosts. Checklist:
Try mysql -u root. If you are able to connect successfully to the server without being asked for a password, you have problems. Any user (not just root) can connect to your MySQL server with full privileges! Review the MySQL installation instructions, paying particular attention to the item about setting a root password.
Use the command SHOW GRANTS and check to see who has access to what. Remove those privileges that are not necessary using the REVOKE command.
Do not keep any plain-text passwords in your database. When your computer becomes compromised, the intruder can take the full list of passwords and use them. Instead use MD5() or another one-way hashing function.
Do not use passwords from dictionaries. There are special programs to break them. Even passwords like ``xfish98'' are very bad. Much better is ``duag98'' which contains the same word ``fish'' but typed one key to the left on a standard QWERTY keyboard. Another method is to use ``Mhall'' which is taken from the first characters of of each word in the sentence ``Mary had a little lamb.'' This is easy to remember and type, but hard to guess for someone who does not know it.
Invest in a firewall. This protects from at least 50% of all types of exploits in any software. Put MySQL behind the firewall or in a demilitarized zone (DMZ). Checklist:
Try to scan your ports from the Internet using a tool such as nmap. MySQL uses port 3306 by default. This port should be inaccessible from untrusted hosts. Another simple way to check whether or not your MySQL port is open is to type telnet server_host 3306 from some remote machine, where server_host is the hostname of your MySQL server. If you get a connection and some garbage characters, the port is open, and should be closed on your firewall or router, unless you really have a good reason to keep it open. If telnet just hangs, everything is OK, the port is blocked.
Do not trust any data entered by your users. They can try to trick your code by entering special or escaped character sequences in Web forms, URLs, or whatever application you have built. Be sure that your application remains secure if a user enters something like ``; DROP DATABASE mysql;''. This is an extreme example, but large security leaks and data loss may occur as a result of hackers using similar techniques, if you do not prepare for them. Also remember to check numeric data. A common mistake is to protect only strings. Sometimes people think that if a database contains only publicly available data that it need not be protected. This is incorrect. At least denial-of-service type attacks can be performed on such databases. The simplest way to protect from this type of attack is to use apostrophes around the numeric constants: SELECT * FROM table WHERE ID='234' instead of SELECT * FROM table WHERE ID=234. MySQL automatically converts this string to a number and strips all non-numeric symbols from it. Checklist:
All WWW applications:
Try to enter `'' and `"' in all your Web forms. If you get any kind of MySQL error, investigate the problem right away.
Try to modify any dynamic URLs by adding %22 (`"'), %23 (`#'), and %27 (`'') in the URL.
Try to modify datatypes in dynamic URLs from numeric ones to character ones containing characters from previous examples.
Your application should be safe against this and similar attacks.
Try to enter characters, spaces, and special symbols instead of numbers in numeric fields. Your application should remove them before passing them to MySQL or your application should generate an error. Passing unchecked values to MySQL is very dangerous!
Check data sizes before passing them to MySQL.
Consider having your application connect to the database using a different user name than the one you use for administrative purposes. Do not give your applications any more access privileges than they need.
Users of PHP:
Check out the addslashes() function.
Users of MySQL C API:
Check out the mysql_escape() API call.
Users of MySQL++:
Check out the escape and quote modifiers for query streams.
Users of Perl DBI:
Check out the quote() method.
Do not transmit plain (unencrypted) data over the Internet. These data are accessible to everyone who has the time and ability to intercept it and use it for their own purposes. Instead, use an encrypted protocol such as SSL or SSH. MySQL supports internal SSL connections as of Version 3.23.9. SSH port-forwarding can be used to create an encrypted (and compressed) tunnel for the communication.
Learn to use the tcpdump and strings utilities. For most cases, you can check whether or not MySQL data streams are unencrypted by issuing a command like the following:
shell> tcpdump -l -i eth0 -w - src or dst port 3306 | strings
(This works under Linux and should work with small modifications under other systems). Warning: If you do not see data this doesn't always actually mean that it is encrypted. If you need high security, you should consult with a security expert.

How to Make MySQL Secure Against Crackers ?
When you connect to a MySQL server, you normally should use a password. The password is not transmitted in clear text over the connection, however the encryption algorithm is not very strong, and with some effort a clever attacker can crack the password if he is able to sniff the traffic between the client and the server. If the connection between the client and the server goes through an untrusted network, you should use an SSH tunnel to encrypt the communication.
All other information is transferred as text that can be read by anyone who is able to watch the connection. If you are concerned about this, you can use the compressed protocol (in MySQL Version 3.22 and above) to make things much harder. To make things even more secure you should use ssh (see http://www.cs.hut.fi/ssh). With this, you can get an encrypted TCP/IP connection between a MySQL server and a MySQL client.
To make a MySQL system secure, you should strongly consider the following suggestions:
Use passwords for all MySQL users. Remember that anyone can log in as any other person as simply as mysql -u other_user db_name if other_user has no password. It is common behavior with client/server applications that the client may specify any user name. You can change the password of all users by editing the mysql_install_db script before you run it, or only the password for the MySQL root user like this:
shell> mysql -u root mysql
mysql> UPDATE user SET Password=PASSWORD('new_password')
WHERE user='root';
mysql> FLUSH PRIVILEGES;
Don't run the MySQL daemon as the Unix root user. It is very dangerous as any user with FILE privileges will be able to create files as root (for example, ~root/.bashrc). To prevent this mysqld will refuse to run as root unless it is specified directly via --user=root option. mysqld can be run as any user instead. You can also create a new Unix user mysql to make everything even more secure. If you run mysqld as another Unix user, you don't need to change the root user name in the user table, because MySQL user names have nothing to do with Unix user names. You can edit the mysql.server script to start mysqld as another Unix user. Normally this is done with the su command.
If you put a password for the Unix root user in the mysql.server script, make sure this script is readable only by root. Check that the Unix user that mysqld runs as is the only user with read/write privileges in the database directories. On Unix platforms, do not run mysqld as root unless you really need to. Consider creating a user named mysql for that purpose.
Don't give the process privilege to all users. The output of mysqladmin processlist shows the text of the currently executing queries, so any user who is allowed to execute that command might be able to see if another user issues an UPDATE user SET password=PASSWORD('not_secure') query. mysqld reserves an extra connection for users who have the process privilege, so that a MySQL root user can log in and check things even if all normal connections are in use. Don't give the file privilege to all users. Any user that has this privilege can write a file anywhere in the file system with the privileges of the mysqld daemon! To make this a bit safer, all files generated with SELECT ... INTO OUTFILE are readable to everyone, and you can't overwrite existing files. The file privilege may also be used to read any file accessible to the Unix user that the server runs as. This could be abused, for example, by using LOAD DATA to load `/etc/passwd' into a table, which can then be read with SELECT.
If you don't trust your DNS, you should use IP numbers instead of hostnames in the grant tables. In principle, the --secure option to mysqld should make hostnames safe. In any case, you should be very careful about creating grant table entries using hostname values that contain wild cards!
If you want to restrict the number of connections for a single user, you can do this by setting the max_user_connections variable in mysqld.

MySQL - Startup options to mysqld which concerns security ?
The following mysqld options affect networking security:
--secure
IP numbers returned by the gethostbyname() system call are checked to make sure they resolve back to the original hostname. This makes it harder for someone on the outside to get access by pretending to be another host. This option also adds some sanity checks of hostnames. The option is turned off by default in MySQL Version 3.21 because sometimes it takes a long time to perform backward resolutions. MySQL Version 3.22 caches hostnames and has this option enabled by default.
--skip-grant-tables
This option causes the server not to use the privilege system at all. This gives everyone full access to all databases! (You can tell a running server to start using the grant tables again by executing mysqladmin flush-privileges or mysqladmin reload.)
--skip-name-resolve
Hostnames are not resolved. All Host column values in the grant tables must be IP numbers or localhost.
--skip-networking
Don't allow TCP/IP connections over the network. All connections to mysqld must be made via Unix sockets. This option is unsuitable for systems that use MIT-pthreads, because the MIT-pthreads package doesn't support Unix sockets.
--skip-show-database
SHOW DATABASE command doesn't return anything.
--safe-show-database
SHOW DATABASE only returns databases for which the user have some kind of privilege.

MySQL - What the Privilege System Does
The primary function of the MySQL privilege system is to authenticate a user connecting from a given host, and to associate that user with privileges on a database such as select, insert, update and delete.
Additional functionality includes the ability to have an anonymous user and to grant privileges for MySQL-specific functions such as LOAD DATA INFILE and administrative operations.

MySQL User Names and Passwords
There are several distinctions between the way user names and passwords are used by MySQL and the way they are used by Unix or Windows:
User names, as used by MySQL for authentication purposes, have nothing to do with Unix user names (login names) or Windows user names. Most MySQL clients by default try to log in using the current Unix user name as the MySQL user name, but that is for convenience only. Client programs allow a different name to be specified with the -u or --user options. This means that you can't make a database secure in any way unless all MySQL user names have passwords. Anyone may attempt to connect to the server using any name, and they will succeed if they specify any name that doesn't have a password. MySQL user names can be up to 16 characters long; Unix user names typically are limited to 8 characters. MySQL passwords have nothing to do with Unix passwords. There is no necessary connection between the password you use to log in to a Unix machine and the password you use to access a database on that machine. MySQL encrypts passwords using a different algorithm than the one used during the Unix login process.
Note that even if the password is
stored 'scrambled', and knowing your 'scrambled' password is enough to be able to connect to the MySQL server!

Connecting to the MySQL Server
MySQL client programs generally require that you specify connection parameters when you want to access a MySQL server: the host you want to connect to, your user name, and your password. For example, the mysql client can be started like this (optional arguments are enclosed between `[' and `]'):
shell> mysql [-h host_name] [-u user_name] [-pyour_pass]
Alternate forms of the -h, -u, and -p options are --host=host_name, --user=user_name, and --password=your_pass. Note that there is no space between -p or --password= and the password following it.
NOTE: Specifying a password on the command line is not secure! Any user on your system may then find out your password by typing a command like: ps auxww.
mysql uses default values for connection parameters that are missing from the command line:
The default hostname is localhost.
The default user name is your Unix login name.
No password is supplied if -p is missing.
Thus, for a Unix user joe, the following commands are equivalent:
shell> mysql -h localhost -u joe
shell> mysql -h localhost
shell> mysql -u joe
shell> mysql
Other MySQL clients behave similarly.
On Unix systems, you can specify different default values to be used when you make a connection, so that you need not enter them on the command line each time you invoke a client program. This can be done in a couple of ways:
You can specify connection parameters in the [client] section of the `.my.cnf' configuration file in your home directory. The relevant section of the file might look like this:
[client]
host=host_name
user=user_name
password=your_pass
You can specify connection parameters using environment variables. The host can be specified for mysql using MYSQL_HOST. The MySQL user name can be specified using USER (this is for Windows only). The password can be specified using MYSQL_PWD (but this is insecure; see the next section).

MySQL - Keeping Your Password Secure
It is inadvisable to specify your password in a way that exposes it to discovery by other users. The methods you can use to specify your password when you run client programs are listed below, along with an assessment of the risks of each method:
Never give a normal user access to the mysql.user table. Knowing the encrypted password for a user makes it possible to login as this user. The passwords are only scrambled so that one shouldn't be able to see the real password you used (if you happen to use a similar password with your other applications).
Use a -pyour_pass or --password=your_pass option on the command line. This is convenient but insecure, because your password becomes visible to system status programs (such as ps) that may be invoked by other users to display command lines. (MySQL clients typically overwrite the command-line argument with zeroes during their initialization sequence, but there is still a brief interval during which the value is visible.)
Use a -p or --password option (with no your_pass value specified). In this case, the client program solicits the password from the terminal:
shell> mysql -u user_name -p
Enter password: ********
The `*' characters represent your password. It is more secure to enter your password this way than to specify it on the command line because it is not visible to other users. However, this method of entering a password is suitable only for programs that you run interactively. If you want to invoke a client from a script that runs non-interactively, there is no opportunity to enter the password from the terminal. On some systems, you may even find that the first line of your script is read and interpreted (incorrectly) as your password!
Store your password in a configuration file. For example, you can list your password in the [client] section of the `.my.cnf' file in your home directory:
[client]
password=your_pass
If you store your password in `.my.cnf', the file should not be group or world readable or writable. Make sure the file's access mode is 400 or 600.
You can store your password in the MYSQL_PWD environment variable, but this method must be considered extremely insecure and should not be used. Some versions of ps include an option to display the environment of running processes; your password will be in plain sight for all to see if you set MYSQL_PWD. Even on systems without such a version of ps, it is unwise to assume there is no other method to observe process environments.
All in all, the safest methods are to have the client program prompt for the password or to specify the password in a properly protected `.my.cnf' file.

Replication in MySQL
One way replication can be used is to increase both robustness and speed. For robustness you can have two systems and can switch to the backup if you have problems with the master. The extra speed is achieved by sending a part of the non-updating queries to the replica server. Of course this only works if non-updating queries dominate, but that is the normal case.
Starting in Version 3.23.15, MySQL supports one-way replication internally. One server acts as the master, while the other acts as the slave. Note that one server could play the roles of master in one pair and slave in the other. The master server keeps a binary log of updates and an index file to binary logs to keep track of log rotation. The slave, upon connecting, informs the master where it left off since the last successfully propagated update, catches up on the updates, and then blocks and waits for the master to notify it of the new updates.
Note that if you are replicating a database, all updates to this database should be done through the master!
On older servers one can use the update log to do simple replication.
Another benefit of using replication is that one can get live backups of the system by doing a backup on a slave instead of doing it on the master.

MySQL - Replication Implementation Overview
MySQL replication is based on the server keeping track of all changes to your database (updates, deletes, etc) in the binary log. and the slave server(s) reading the saved queries from the master server's binary log so that the slave can execute the same queries on its copy of the data.
It is very important to realize that the binary log is simply a record starting from a fixed point in time (the moment you enable binary logging). Any slaves which you set up will need copies of all the data from your master as it existed the moment that you enabled binary logging on the master. If you start your slaves with data that doesn't agree with what was on the master when the binary log was started, your slaves may fail.
A future version (4.0) of MySQL will remove the need to keep a (possibly large) snapshot of data for new slaves that you might wish to set up through the live backup functionality with no locking required. However, at this time, it is necessary to block all writes either with a global read lock or by shutting down the master while taking a snapshot.
Once a slave is properly configured and running, it will simply connect to the master and wait for updates to process. If the master goes away or the slave loses connectivity with your master, it will keep trying to connect every master-connect-retry seconds until it is able to reconnect and resume listening for updates.
Each slave keeps track of where it left off. The master server has no knowledge of how many slaves there are or which ones are up-to-date at any given time.

MySQL - HOWTO
Below is a quick description of how to set up complete replication on your current MySQL server. It assumes you want to replicate all your databases and have not configured replication before. You will need to shutdown your master server briefly to complete the steops outlined below.
Make sure you have a recent version of MySQL installed on the master and slave(s). Use Version 3.23.29 or higher. Previous releases used a different binary log format and had bugs which have been fixed in newer releases. Please, do not report bugs until you have verified that the problem is present in the latest release.
Set up special a replication user on the master with the FILE privilege and permission to connect from all the slaves. If the user is only doing replication (which is recommended), you don't need to grant any additional privileges. For example, to create a user named repl which can access your master from any host, you might use this command:
GRANT FILE ON *.* TO repl@"%" IDENTIFIED BY '';
Shut down MySQL on the master.
mysqladmin -u root -p<password> shutdown
Snapshot all the data on your master server. The easiest way to do this (on Unix) is to simply use tar to produce an archvie of your entrie data directory. The exact data directory location depends on your installation.
tar -cvf /tmp/mysql-snapshot.tar /path/to/data-dir
Windows users can use WinZip or similar software to create an archive of the data directory.
In my.cnf on the master add log-bin and server-id=unique number to the [mysqld] section and restart it. It is very important that the id of the slave is different from the id of the master. Think of server-id as something similar to the IP address - it uniquely identifies the server instance in the comminity of replication partners.
[mysqld]
log-bin
server-id=1
Restart MySQL on the master.
Add the following to my.cnf on the slave(s):
master-host=<hostname of the master>
master-user=<replication user name>
master-password=<replication user password>
master-port=<TCP/IP port for master>
server-id=<some unique number between 2 and 2^32-1>
replacing the values in <> with what is relevant to your system. server-id must be different for each server participating in replication. If you don't specify a server-id, it will be set to 1 if you have not defined master-host, else it will be set to 2. Note that in the case of server-id omission the master will refuse connections from all slaves, and the slave will refuse to connect to a master. Thus, omitting server-id is only good for backup with a binary log.
Copy the snapshot data into your data directory on your slave(s). Make sure that the privileges on the files and directories are correct. The user which MySQL runs as needs to be able to read and write to them, just as on the master.
Restart the slave(s).
After you have done the above, the slave(s) should connect to the master and catch up on any updates which happened since the snapshot was taken.
If you have forgotten to set server-id for the slave you will get the following error in the error log file:
Warning: one should set server_id to a non-0 value if master_host is set.
The server will not act as a slave.
If you have forgot to do this for the master, the slaves will not be able to connect to the master.
If a slave is not able to replicate for any reason, you will find error messages in the error log on the slave.
Once a slave is replicating, you will find a file called master.info in the same directory as your error log. The master.info file is used by the slave to keep track of how much of the master's binary log is has processed. Do not remove or edit the file, unless you really know what you are doing. Even in that case, it is preferred that you use CHANGE MASTER TO command.

MySQL - SQL Commands Related to Replication
Replication can be controlled through the SQL interface. Below is the summary of commands:
Command Description
SLAVE START Starts the slave thread. (Slave)
SLAVE STOP Stops the slave thread. (Slave)
SET SQL_LOG_BIN=0 Disables update logging if the user has process privilege. Ignored otherwise. (Master)
SET SQL_LOG_BIN=1 Re-enables update logging if the user has process privilege. Ignored otherwise. (Master)
SET SQL_SLAVE_SKIP_COUNTER=n Skip the next n events from the master. Only valid when the slave thread is not running, otherwise, gives an error. Useful for recovering from replication glitches.
RESET MASTER Deletes all binary logs listed in the index file, resetting the binlog index file to be empty. In pre-3.23.26 versions, FLUSH MASTER (Master)
RESET SLAVE Makes the slave forget its replication position in the master logs. In pre 3.23.26 versions the command was called FLUSH SLAVE(Slave)
LOAD TABLE tblname FROM MASTER Downloads a copy of the table from master to the slave. (Slave)
CHANGE MASTER TO master_def_list Changes the master parameters to the values specified in master_def_list and restarts the slave thread. master_def_list is a comma-separated list of master_def where master_def is one of the following: MASTER_HOST, MASTER_USER, MASTER_PASSWORD, MASTER_PORT, MASTER_CONNECT_RETRY, MASTER_LOG_FILE, MASTER_LOG_POS. Example:
CHANGE MASTER TO
MASTER_HOST='master2.mycompany.com',
MASTER_USER='replication',
MASTER_PASSWORD='bigs3cret',
MASTER_PORT=3306,
MASTER_LOG_FILE='master2-bin.001',
MASTER_LOG_POS=4;
You only need to specify the values that need to be changed. The values that you omit will stay the same with the exception of when you change the host or the port. In that case, the slave will assume that since you are connecting to a different host or a different port, the master is different. Therefore, the old values of log and position are not applicable anymore, and will automatically be reset to an empty string and 0, respectively (the start values). Note that if you restart the slave, it will remember its last master. If this is not desirable, you should delete the `master.info' file before restarting, and the slave will read its master from my.cnf or the command line. (Slave)
SHOW MASTER STATUS Provides status information on the binlog of the master. (Master)
SHOW SLAVE STATUS Provides status information on essential parameters of the slave thread. (Slave)
SHOW MASTER LOGS Only available starting in Version 3.23.28. Lists the binary logs on the master. You should use this command prior to PURGE MASTER LOGS TO to find out how far you should go.
PURGE MASTER LOGS TO 'logname' Available starting in Version 3.23.28. Deletes all the replication logs that are listed in the log index as being prior to the specified log, and removed them from the log index, so that the given log now becomes first. Example:
PURGE MASTER LOGS TO 'mysql-bin.010'
This command will do nothing and fail with an error if you have an active slave that is currently reading one of the logs you are trying to delete. However, if you have a dormant slave,and happen to purge one of the logs it wants to read, the slave will be unable to replicate once it comes up. The command is safe to run while slaves are replicating - you do not need to stop them. You must first check all the slaves with SHOW SLAVE STATUS to see which log they are on, then do a listing of the logs on the master with SHOW MASTER LOGS, find the earliest log among all the slaves (if all the slaves are up to date, this will be the last log on the list), backup all the logs you are about to delete (optional) and purge up to the target log.

MySQL - Replication FAQ
Why do I sometimes see more than one Binlog_Dump thread on the master after I have restarted the slave?
Binlog_Dump is a continuous process that is handled by the server in the following way:
Catch up on the updates.
Once there are no more updates left, go into pthread_cond_wait(), from which we can be awakened either by an update or a kill.
On wake up, check the reason. If we are not supposed to die, continue the Binlog_dump loop.
If there is some fatal error, such as detecting a dead client, terminate the loop.
So if the slave thread stops on the slave, the corresponding Binlog_Dump thread on the master will not notice it until after at least one update to the master (or a kill), which is needed to wake it up from pthread_cond_wait(). In the meantime, the slave could have opened another connection, which resulted in another Binlog_Dump thread.
The above problem should not be present in Version 3.23.26 and later versions. In Version 3.23.26 we added server-id to each replication server, and now all the old zombie threads are killed on the master when a new replication thread connects from the same slave

How do I rotate replication logs?
In Version 3.23.28 you should use PURGE MASTER LOGS TO command after determining which logs can be deleted, and optionally backing them up first. In earlier versions the process is much more painful, and cannot be safely done without stopping all the slaves in the case that you plan to re-use log names. You will need to stop the slave threads, edit the binary log index file, delete all the old logs, restart the master, start slave threads,and then remove the old log files.

How do I upgrade on a hot replication setup?
If you are upgrading pre-3.23.26 versions, you should just lock the master tables, let the slave catch up, then run FLUSH MASTER on the master, and FLUSH SLAVE on the slave to reset the logs, then restart new versions of the master and the slave. Note that the slave can stay down for some time - since the master is logging all the updates, the slave will be able to catch up once it is up and can connect.
After 3.23.26, we have locked the replication protocol for modifications, so you can upgrade masters and slave on the fly to a newer 3.23 version and you can have different versions of MySQL running on the slave and the master, as long as they are both newer than 3.23.26.

What issues should I be aware of when setting up two-way replication?
MySQL replication currently does not support any locking protocol between master and slave to guarantee the atomicity of a distributed (cross-server) update. In in other words, it is possible for client A to make an update to co-master 1, and in the meantime, before it propagates to co-master 2, client B could make an update to co-master 2 that will make the update of client A work differently than it did on co-master 1. Thus when the update of client A will make it to co-master 2, it will produce tables that will be different than what you have on co-master 1, even after all the updates from co-master 2 have also propagated. So you should not co-chain two servers in a two-way replication relationship, unless you are sure that you updates can safely happen in any order, or unless you take care of mis-ordered updates somehow in the client code.
You must also realize that two-way replication actually does not improve performance very much, if at all, as far as updates are concerned. Both servers need to do the same amount of updates each, as you would have one server do. The only difference is that there will be a little less lock contention, because the updates originating on another server will be serialized in one slave thread. This benefit, though, might be offset by network delays.

How can I use replication to improve performance of my system?
You should set up one server as the master, and direct all writes to it, and configure as many slaves as you have the money and rackspace for, distributing the reads among the master and the slaves. You can also start the slaves with --skip-bdb, --low-priority-updates and --delay-key-write-for-all-tables to get speed improvements for the slave. In this case the slave will use non-transactional MyISAM tables instead of BDB tables to get more speed.

What should I do to prepare my client code to use performance-enhancing replication?
A: If the part of your code that is responsible for database access has been properly abstracted/modularized, converting it to run with the replicated setup should be very smooth and easy - just change the implementation of your database access to read from some slave or the master, and to awlays write to the master. If your code does not have this level of abstraction, setting up a replicated system will give you an opportunity/motivation to it clean up. You should start by creating a wrapper library /module with the following functions:
safe_writer_connect()
safe_reader_connect()
safe_reader_query()
safe_writer_query()
safe_ means that the function will take care of handling all the error conditions.
You should then convert your client code to use the wrapper library. It may be a painful and scary process at first, but it will pay off in the long run. All applications that follow the above pattern will be able to take advantage of one-master/many slaves solution. The code will be a lot easier to maintain, and adding troubleshooting options will be trivial. You will just need to modify one or two functions, for example, to log how long each query took, or which query, among your many thousands, gave you an error. If you have written a lot of code already, you may want to automate the conversion task by using Monty's replace utility, which comes with the standard distribution of MySQL, or just write your own Perl script. Hopefully, your code follows some recognizable pattern. If not, then you are probably better off re-writing it anyway, or at least going through and manually beating it into a pattern.
Note that, of course, you can use different names for the functions. What is important is having unified interface for connecting for reads, connecting for writes, doing a read, and doing a write.

When and how much can MySQL replication improve the performance of my system?
MySQL replication is most beneficial for a system with frequent reads and not so frequent writes. In theory, by using a one master/many slaves setup you can scale by adding more slaves until you either run out of network bandwidth, or your update load grows to the point that the master cannot handle it.
In order to determine how many slaves you can get before the added benefits begin to level out, and how much you can improve performance of your site, you need to know your query patterns, and empirically (by benchmarking) determine the relationship between the throughput on reads (reads per second, or max_reads) and on writes max_writes) on a typical master and a typical slave. The example below will show you a rather simplified calculation of what you can get with replication for our imagined system.
Let's say our system load consists of 10% writes and 90% reads, and we have determined that max_reads = 1200 - 2 * max_writes, or in other words, our system can do 1200 reads per second with no writes, our average write is twice as slow as average read, and the relationship is linear. Let us suppose that our master and slave are of the same capacity, and we have N slaves and 1 master. Then we have for each server (master or slave):
reads = 1200 - 2 * writes (from bencmarks)
reads = 9* writes / (N + 1) (reads split, but writes go to all servers)
9*writes/(N+1) + 2 * writes = 1200
writes = 1200/(2 + 9/(N+1)
So if N = 0, which means we have no replication, our system can handle 1200/11, about 109 writes per second (which means we will have 9 times as many reads due to the nature of our application).
If N = 1, we can get up to 184 writes per second.
If N = 8, we get up to 400.
If N = 17, 480 writes.
Eventually as N approaches infinity (and our budget negative infinity), we can get very close to 600 writes per second, increasing system throughput about 5.5 times. However, with only 8 servers, we increased it almost 4 times already.
Note that our computations assumed infinite network bandwidth, and neglected several other factors that could turn out to be signficant on your system. In many cases, you may not be able to make a computation similar to the one above that will accurately predict what will happen on your system if you add N replication slaves. However, answering the following questions should help you decided whether and how much, if at all, the replication will improve the performance of your system:
What is the read/write ratio on your system?
How much more write load can one server handle if you reduce the reads?
How many slaves do you have bandwidth for on your network?

How can I use replication to provide redundancy/high availability?
With the currently available features, you would have to set up a master and a slave (or several slaves), and write a script that will monitor the master to see if it is up, and instruct your applications and the slaves of the master change in case of failure. Some suggestions:
To tell a slave to change the master use the CHANGE MASTER TO command.
A good way to keep your applications informed where the master is by having a dynamic DNS entry for the master. With bind you can use nsupdate to dynamically update your DNS.
You should run your slaves with the log-bin option and without log-slave-updates. This way the slave will be ready to become a master as soon as you issue STOP SLAVE; RESET MASTER, and CHANGE MASTER TO on the other slaves. It will also help you catch spurious updates that may happen because of misconfiguration of the slave (ideally, you want to configure access rights so that no client can update the slave, except for the slave thread) combined with the bugs in your client programs (they should never update the slave directly).
We are currently working on intergrating an automatic master election system into MySQL, but until it is ready, you will have to create your own monitoring tools.

MySQL - Troubleshooting Replication
If you have followed the instructions, and your replication setup is not working, first elliminate the user error factor by checking the following:
Is the master logging to the binary log? Check with SHOW MASTER STATUS. If it is, Position will be non-zero. If not, verify that you have given the master log-bin option and have set server-id.
Is the slave running? Check with SHOW SLAVE STATUS. The answer is found in Slave_running column. If not, verify slave options and check the error log for messages.
If the slave is running, did it establish connection with the master? Do SHOW PROCESSLIST, find the thread with system user value in User column and none in the Host column, and check the State column. If it says connecting to master, verify the privileges for the replication user on the master, master host name, your DNS setup, whether the master is actually running, whether it is reachable from the slave, and if all that seems ok, read the error logs.
If the slave was running, but then stopped, look at SHOW SLAVE STATUS output andcheck the error logs. It usually happens when some query that succeeded on the master fails on the slave. This should never happen if you have taken a proper snapshot of the master, and never modify the data on the slave outside of the slave thread. If it does, it is a bug, read below on how to report it.
If a query on that succeeded on the master refuses to run on the slave, and a full database resync ( the proper thing to do ) does not seem feasible, try the following:
First see if there is some stray record in the way. Understand how it got there, then delete it and run SLAVE START If the above does not work or does not apply, try to understand if it would be safe to make the update manually ( if needed) and then ignore the next query from the master.
If you have decided you can skip the next query, do SET SQL_SLAVE_SKIP_COUNTER=1; SLAVE START; to skip a query that does not use auto_increment, last_insert_id or timestamp, or SET SQL_SLAVE_SKIP_COUNTER=2; SLAVE START; otherwise If you are sure the slave started out perfectly in sync with the master, and no one has updated the tables involved outside of slave thread, report the bug, so you will not have to do the above tricks again.
Make sure you are not running into an old bug by upgrading to the most recent version.
If all else fails, read the error logs. If they are big, grep -i slave /path/to/your-log.err on the slave. There is no generic pattern to search for on the master, as the only errors it logs are general system errors - if it can, it will send the error to the slave when things go wrong.
When you have determined that there is no user error involved, and replication still either does not work at all or is unstable, it is time to start working on a bug report. We need to get as much info as possible from you to be able to track down the bug. Please do spend some time and effort preparing a good bug report. Ideally, we would like to have a test case in the format found in mysql-test/t/rpl* directory of the source tree. If you submit a test case like that, you can expect a patch within a day or two in most cases, although, of course, you mileage may vary depending on a number of factors.
Second best option is a just program with easily configurable connection arguments for the master and the slave that will demonstrate the problem on our systems. You can write one in Perl or in C, depending on which language you know better.
If you have one of the above ways to demonstrate the bug, use mysqlbug to prepare a bug report and send it to bugs@lists.mysql.com. If you have a phantom - a problem that does occur but you cannot duplicate "at will":
Verify that there is no user error involved. For example, if you update the slave outside of the slave thread, the data will be out of sync, and you can have unique key violations on updates, in which case the slave thread will stop and wait for you to clean up the tables manually to bring them in sync.
Run slave with log-slave-updates and log-bin - this will keep a log of all updates on the slave.
Save all evidence before reseting the replication. If we have no or only sketchy information, it would take us a while to track down the problem. The evidence you should collect is:
All binary logs on the master
All binary log on the slave
The output of SHOW MASTER STATUS on the master at the time you have discovered the problem
The output of SHOW SLAVE STATUS on the master at the time you have discovered the problem
Error logs on the master and on the slave
Use mysqlbinlog to examine the binary logs. The following should be helpful to find the trouble query, for example:
mysqlbinlog -j pos_from_slave_status /path/to/log_from_slave_status | head
Once you have collected the evidence on the phantom problem, try hard to isolate it into a separate test case first. Then report the problem to bugs@lists.mysql.com with as much info as possible.

Getting Maximum Performance from MySQL
Optimization is a complicated task because it ultimately requires understanding of the whole system. While it may be possible to do some local optimizations with small knowledge of your system/application, the more optimal you want your system to become the more you will have to know about it.
So this chapter will try to explain and give some examples of different ways to optimize MySQL. But remember that there are always some (increasingly harder) additional ways to make the system even faster.

MySQL - Optimization Overview
The most important part for getting a system fast is of course the basic design. You also need to know what kinds of things your system will be doing, and what your bottlenecks are.
The most common bottlenecks are:
Disk seeks. It takes time for the disk to find a piece of data. With modern disks in 1999, the mean time for this is usually lower than 10ms, so we can in theory do about 1000 seeks a second. This time improves slowly with new disks and is very hard to optimize for a single table. The way to optimize this is to spread the data on more than one disk. Disk reading/writing. When the disk is at the correct position we need to read the data. With modern disks in 1999, one disk delivers something like 10-20Mb/s. This is easier to optimize than seeks because you can read in parallel from multiple disks.
CPU cycles. When we have the data in main memory (or if it already were there) we need to process it to get to our result. Having small tables compared to the memory is the most common limiting factor. But then, with small tables speed is usually not the problem.
Memory bandwidth. When the CPU needs more data than can fit in the CPU cache the main memory bandwidth becomes a bottleneck. This is an uncommon bottleneck for most systems, but one should be aware of it.

MySQL - System/Compile Time and Startup Parameter Tuning
We start with the system level things since some of these decisions have to be made very early. In other cases a fast look at this part may suffice because it not that important for the big gains. However, it is always nice to have a feeling about how much one could gain by changing things at this level.
The default OS to use is really important! To get the most use of multiple CPU machines one should use Solaris (because the threads works really nice) or Linux (because the 2.2 kernel has really good SMP support). Also on 32-bit machines Linux has a 2G file size limit by default. Hopefully this will be fixed soon when new filesystems are released (XFS/Reiserfs). If you have a desperate need for files bigger than 2G on Linux-Intel 32 bit, you should get the LFS patch for the ext2 file system.
Because we have not run MySQL in production on that many platforms, we advice you to test your intended platform before choosing it, if possible.
Other tips:
If you have enough RAM, you could remove all swap devices. Some operating systems will use a swap device in some contexts even if you have free memory.
Use the --skip-locking MySQL option to avoid external locking. Note that this will not impact MySQL's functionality as long as you only run one server. Just remember to take down the server (or lock relevant parts) before you run myisamchk. On some system this switch is mandatory because the external locking does not work in any case. The --skip-locking option is on by default when compiling with MIT-pthreads, because flock() isn't fully supported by MIT-pthreads on all platforms. It's also on default for Linux as Linux file locking are not yet safe. The only case when you can't use --skip-locking is if you run multiple MySQL servers (not clients) on the same data, or run myisamchk on the table without first flushing and locking the mysqld server tables first. You can still use LOCK TABLES/UNLOCK TABLES even if you are using --skip-locking 12.2.1 How Compiling and Linking Affects the Speed of MySQL
Most of the following tests are done on Linux with the MySQL benchmarks, but they should give some indication for other operating systems and workloads.
You get the fastest executable when you link with -static.
On Linux, you will get the fastest code when compiling with pgcc and -O6. To compile `sql_yacc.cc' with these options, you need about 200M memory because gcc/pgcc needs a lot of memory to make all functions inline. You should also set CXX=gcc when configuring MySQL to avoid inclusion of the libstdc++ library (it is not needed). Note that with some versions of pgcc, the resulting code will only run on true Pentium processors, even if you use the compiler option that you want the resulting code to be working on all x586 type processors (like AMD).
By just using a better compiler and/or better compiler options you can get a 10-30 % speed increase in your application. This is particularly important if you compile the SQL server yourself!
We have tested both the Cygnus CodeFusion and Fujitsu compilers, but when we tested them, neither was sufficiently bug free to allow MySQL to be compiled with optimizations on.
When you compile MySQL you should only include support for the character sets that you are going to use. (Option --with-charset=xxx). The standard MySQL binary distributions are compiled with support for all character sets.
Here is a list of some measurements that we have done:
If you use pgcc and compile everything with -O6, the mysqld server is 1% faster than with gcc 2.95.2. If you link dynamically (without -static), the result is 13% slower on Linux. Note that you still can use a dynamic linked MySQL library. It is only the server that is critical for performance.
If you connect using TCP/IP rather than Unix sockets, the result is 7.5% slower on the same computer. (If you are connection to localhost, MySQL will, by default, use sockets).
If you compile with --with-debug=full, then you will loose 20 % for most queries, but some queries may take substantially longer (The MySQL benchmarks ran 35 % slower) If you use --with-debug, then you will only loose 15 %. On a Sun SPARCstation 20, SunPro C++ 4.2 is 5 % faster than gcc 2.95.2.
Compiling with gcc 2.95.2 for ultrasparc with the option -mcpu=v8 -Wa,-xarch=v8plusa gives 4 % more performance. On Solaris 2.5.1, MIT-pthreads is 8-12% slower than Solaris native threads on a single processor. With more load/CPUs the difference should get bigger.
Running with --log-bin makes [MySQL 1 % slower.
Compiling without frame pointers -fomit-frame-pointer with gcc makes MySQL 1 % faster.
The MySQL-Linux distribution provided by MySQL AB used to be compiled with pgcc, but we had to go back to regular gcc because of a bug in pgcc that would generate the code that does not run on AMD. We will continue using gcc until that bug is resolved. In the meantime, if you have a non-AMD machine, you can get a faster binary by compiling with pgcc. The standard MySqL Linux binary is linked statically to get it faster and more portable.

MySQL - Disk Issues
As mentioned before, disks seeks are a big performance bottleneck. This problems gets more and more apparent when the data starts to grow so large that effective caching becomes impossible. For large databases, where you access data more or less randomly, you can be sure that you will need at least one disk seek to read and a couple of disk seeks to write things. To minimize this problem, use disks with low seek times.
Increase the number of available disk spindles (and thereby reduce the seek overhead) by either symlink files to different disks or striping the disks.
Using symbolic links
This means that you symlink the index and/or data file(s) from the normal data directory to another disk (that may also be striped). This makes both the seek and read times better (if the disks are not used for other things).
Striping
Striping means that you have many disks and put the first block on the first disk, the second block on the second disk, and the Nth on the (N mod number_of_disks) disk, and so on. This means if your normal data size is less than the stripe size (or perfectly aligned) you will get much better performance. Note that striping is very dependent on the OS and stripe-size. So benchmark your application with different stripe-sizes.
Note that the speed
difference for striping is very dependent on the parameters. Depending on how you set the striping parameters and number of disks you may get a difference in orders of magnitude. Note that you have to choose to optimize for random or sequential access.
For reliability you may want to use RAID 0+1 (striping + mirroring), but in this case you will need 2*N drives to hold N drives of data. This is probably the best option if you have the money for it! You may, however, also have to invest in some volume-management software to handle it efficiently.
A good option is to have semi-important data (that can be regenerated) on RAID 0 disk while storing really important data (like host information and logs) on a RAID 0+1 or RAID N disk. RAID N can be a problem if you have many writes because of the time to update the parity bits.
You may also set the parameters for the file system that the database uses. One easy change is to mount the file system with the noatime option. That makes it skip the updating of the last access time in the inode and by this will avoid some disk seeks.
On Linux, you can get much more performance (up to 100 % under load is not uncommon) by using hdpram to configure your disk's interface! The following should be quite good hdparm options for MySQL (and probably many other applications): hdparm -m 16 -d 1
Note that the performance/reliability when using the above depends on your hardware, so we strongly suggest that you test your system throughly after using hdparm! Please consult the hdparm man page for more information! If hdparm is not used wisely, filesystem corruption may result. Backup everything before experimenting!
On many operating systems you can mount the disks with the 'async' flag to set the file system to be updated asynchronously. If your computer is reasonable stable, this should give you more performance without sacrificing too much reliability. (This flag is on by default on Linux.)
If you don't need to know when a file was last accessed (which is not really useful on a database server), you can mount your file systems with the noatime flag.

How MySQL Opens and Closes Tables ?
table_cache, max_connections, and max_tmp_tables affect the maximum number of files the server keeps open. If you increase one or both of these values, you may run up against a limit imposed by your operating system on the per-process number of open file descriptors. However, you can increase the limit on many systems. Consult your OS documentation to find out how to do this, because the method for changing the limit varies widely from system to system.
table_cache is related to max_connections. For example, for 200 concurrent running connections, you should have a table cache of at least 200 * n, where n is the maximum number of tables in a join.
The cache of open tables can grow to a maximum of table_cache (default 64; this can be changed with the -O table_cache=# option to mysqld). A table is never closed, except when the cache is full and another thread tries to open a table or if you use mysqladmin refresh or mysqladmin flush-tables.
When the table cache fills up, the server uses the following procedure to locate a cache entry to use:
Tables that are not currently in use are released, in least-recently-used order.
If the cache is full and no tables can be released, but a new table needs to be opened, the cache is temporarily extended as necessary.
If the cache is in a temporarily-extended state and a table goes from in-use to not-in-use state, the table is closed and released from the cache.
A table is opened for each concurrent access. This means that if you have two threads accessing the same table or access the table twice in the same query (with AS) the table needs to be opened twice. The first open of any table takes two file descriptors; each additional use of the table takes only one file descriptor. The extra descriptor for the first open is used for the index file; this descriptor is shared among all threads.
You can check if your table cache is too small by checking the mysqld variable opened_tables. If this is quite big, even if you haven't done a lot of FLUSH TABLES, you should increase your table cache.

MySQL - Drawbacks to Creating Large Numbers of Tables in the Same Database
If you have many files in a directory, open, close, and create operations will be slow. If you execute SELECT statements on many different tables, there will be a little overhead when the table cache is full, because for every table that has to be opened, another must be closed. You can reduce this overhead by making the table cache larger.

MySQL - Why So Many Open tables?
When you run mysqladmin status, you'll see something like this:
Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12
This can be somewhat perplexing if you only have 6 tables.
MySQL is multithreaded, so it may have many queries on the same table simultaneously. To minimize the problem with two threads having different states on the same file, the table is opened independently by each concurrent thread. This takes some memory and one extra file descriptor for the data file. The index file descriptor is shared between all threads.

How MySQL Uses Memory ?
The list below indicates some of the ways that the mysqld server uses memory. Where applicable, the name of the server variable relevant to the memory use is given:
The key buffer (variable key_buffer_size) is shared by all threads; Other buffers used by the server are allocated as needed.
Each connection uses some thread-specific space: A stack (default 64K, variable thread_stack), a connection buffer (variable net_buffer_length), and a result buffer (variable net_buffer_length). The connection buffer and result buffer are dynamically enlarged up to max_allowed_packet when needed. When a query is running, a copy of the current query string is also allocated.
All threads share the same base memory.
Only the compressed ISAM / MyISAM tables are memory mapped. This is because the 32-bit memory space of 4GB is not large enough for most big tables. When systems with a 64-bit address space become more common we may add general support for memory mapping.
Each request doing a sequential scan over a table allocates a read buffer (variable record_buffer).
All joins are done in one pass, and most joins can be done without even using a temporary table. Most temporary tables are memory-based (HEAP) tables. Temporary tables with a big record length (calculated as the sum of all column lengths) or that contain BLOB columns are stored on disk. One problem in MySQL versions before Version 3.23.2 is that if a HEAP table exceeds the size of tmp_table_size, you get the error The table tbl_name is full. In newer versions this is handled by automatically changing the in-memory (HEAP) table to a disk-based (MyISAM) table as necessary. To work around this problem, you can increase the temporary table size by setting the tmp_table_size option to mysqld, or by setting the SQL option SQL_BIG_TABLES in the client program.
In MySQL Version 3.20, the maximum size of the temporary table was record_buffer*16, so if you are using this version, you have to increase the value of record_buffer. You can also start mysqld with the --big-tables option to always store temporary tables on disk. However, this will affect the speed of many complicated queries.
Most requests doing a sort allocates a sort buffer and 0-2 temporary files depending on the result set size.
Almost all parsing and calculating is done in a local memory store. No memory overhead is needed for small items and the normal slow memory allocation and freeing is avoided. Memory is allocated only for unexpectedly large strings (this is done with malloc() and free()).
Each index file is opened once and the data file is opened once for each concurrently running thread. For each concurrent thread, a table structure, column structures for each column, and a buffer of size 3 * n is allocated (where n is the maximum row length, not counting BLOB columns). A BLOB uses 5 to 8 bytes plus the length of the BLOB data. The ISAM/MyISAM table handlers will use one extra row buffer for internal usage.
For each table having BLOB columns, a buffer is enlarged dynamically to read in larger BLOB values. If you scan a table, a buffer as large as the largest BLOB value is allocated.
Table handlers for all in-use tables are saved in a cache and managed as a FIFO. Normally the cache has 64 entries. If a table has been used by two running threads at the same time, the cache contains two entries for the table.
A mysqladmin flush-tables command closes all tables that are not in use and marks all in-use tables to be closed when the currently executing thread finishes. This will effectively free most in-use memory. ps and other system status programs may report that mysqld uses a lot of memory. This may be caused by thread-stacks on different memory addresses. For example, the Solaris version of ps counts the unused memory between stacks as used memory. You can verify this by checking available swap with swap -s. We have tested mysqld with commercial memory-leakage detectors, so there should be no memory leaks.

How MySQL Locks Tables ?
You can find a discussion about different locking methods in the appendix.
All locking in MySQL is deadlock-free. This is managed by always requesting all needed locks at once at the beginning of a query and always locking the tables in the same order.
The locking method MySQL uses for WRITE locks works as follows:
If there are no locks on the table, put a write lock on it. Otherwise, put the lock request in the write lock queue. The locking method MySQL uses for READ locks works as follows:
If there are no write locks on the table, put a read lock on it. Otherwise, put the lock request in the read lock queue. When a lock is released, the lock is made available to the threads in the write lock queue, then to the threads in the read lock queue.
This means that if you have many updates on a table, SELECT statements will wait until there are no more updates.
To work around this for the case where you want to do many INSERT and SELECT operations on a table, you can insert rows in a temporary table and update the real table with the records from the temporary table once in a while.
This can be done with the following code:
mysql> LOCK TABLES real_table WRITE, insert_table WRITE;
mysql> insert into real_table select * from insert_table;
mysql> TRUNCATE TABLE insert_table;
mysql> UNLOCK TABLES;
You can use the LOW_PRIORITY options with INSERT if you want to prioritize retrieval in some specific cases.
You could also change the locking code in `mysys/thr_lock.c' to use a single queue. In this case, write locks and read locks would have the same priority, which might help some applications.

MySQL - Table Locking Issues
The table locking code in MySQL is deadlock free.
MySQL uses table locking (instead of row locking or column locking) on all table types, except BDB tables, to achieve a very high lock speed. For large tables, table locking is MUCH better than row locking for most applications, but there are, of course, some pitfalls.
For BDB tables, MySQL only uses table locking if you explicitely lock the table with LOCK TABLES or execute a command that will modify every row in the table, like ALTER TABLE.
In MySQL Version 3.23.7 and above, you can insert rows into MyISAM tables at the same time other threads are reading from the table. Note that currently this only works if there are no holes after deleted rows in the table at the time the insert is made.
Table locking enables many threads to read from a table at the same time, but if a thread wants to write to a table, it must first get exclusive access. During the update, all other threads that want to access this particular table will wait until the update is ready.
As updates on tables normally are considered to be more important than SELECT, all statements that update a table have higher priority than statements that retrieve information from a table. This should ensure that updates are not 'starved' because one issues a lot of heavy queries against a specific table. (You can change this by using LOW_PRIORITY with the statement that does the update or HIGH_PRIORITY with the SELECT statement.)
Starting from MySQL Version 3.23.7 one can use the max_write_lock_count variable to force MySQL to temporary give all SELECT statements, that wait for a table, a higher priority after a specific number of inserts on a table.
Table locking is, however, not very good under the following senario:
A client issues a SELECT that takes a long time to run.
Another client then issues an UPDATE on a used table. This client will wait until the SELECT is finished.
Another client issues another SELECT statement on the same table. As UPDATE has higher priority than SELECT, this SELECT will wait for the UPDATE to finish. It will also wait for the first SELECT to finish!
A thread is waiting for something like full disk, in which case all threads that wants to access the problem table will also be put in a waiting state until more disk space is made available.
Some possible solutions to this problem are:
Try to get the SELECT statements to run faster. You may have to create some summary tables to do this.
Start mysqld with --low-priority-updates. This will give all statements that update (modify) a table lower priority than a SELECT statement. In this case the last SELECT statement in the previous scenario would execute before the INSERT statement. You can give a specific INSERT, UPDATE, or DELETE statement lower priority with the LOW_PRIORITY attribute.
Start mysqld with a low value for max_write_lock_count to give READ locks after a certain number of WRITE locks.
You can specify that all updates from a specific thread should be done with low priority by using the SQL command: SET SQL_LOW_PRIORITY_UPDATES=1.
You can specify that a specific SELECT is very important with the HIGH_PRIORITY attribute.
If you have problems with INSERT combined with SELECT, switch to use the new MyISAM tables as these support concurrent SELECTs and INSERTs.
If you mainly mix INSERT and SELECT statements, the DELAYED attribute to INSERT will probably solve your problems.
If you have problems with SELECT and DELETE, the LIMIT option to DELETE may help.

How MySQL uses DNS ?
When a new threads connects to mysqld, mysqld will span a new thread to handle the request. This thread will first check if the hostname is in the hostname cache. If not the thread will call gethostbyaddr_r() and gethostbyname_r() to resolve the hostname.
If the operating system doesn't support the above thread-safe calls, the thread will lock a mutex and call gethostbyaddr() and gethostbyname() instead. Note that in this case no other thread can resolve other hostnames that is not in the hostname cache until the first thread is ready.
You can disable DNS host lookup by starting mysqld with --skip-name-resolve. In this case you can however only use IP names in the MySQL privilege tables.
If you have a very slow DNS and many hosts, you can get more performance by either disabling DNS lookop with --skip-name-resolve or by increasing the HOST_CACHE_SIZE define (default: 128) and recompile mysqld.
You can disable the hostname cache with --skip-host-cache. You can clear the hostname cache with FLUSH HOSTS or mysqladmin flush-hosts.
If you don't want to allow connections over TCP/IP, you can do this by starting mysqld with --skip-networking.

MySQL - Get Your Data as Small as Possible
One of the most basic optimization is to get your data (and indexes) to take as little space on the disk (and in memory) as possible. This can give huge improvements because disk reads are faster and normally less main memory will be used. Indexing also takes less resources if done on smaller columns.
MySQL supports a lot of different table types and row formats. Choosing the right table format may give you a big performance gain.
You can get better performance on a table and minimize storage space using the techniques listed below:
Use the most efficient (smallest) types possible. MySQL has many specialized types that save disk space and memory.
Use the smaller integer types if possible to get smaller tables. For example, MEDIUMINT is often better than INT.
Declare columns to be NOT NULL if possible. It makes everything faster and you save one bit per column. Note that if you really need NULL in your application you should definitely use it. Just avoid having it on all columns by default.
If you don't have any variable-length columns (VARCHAR, TEXT, or BLOB columns), a fixed-size record format is used. This is faster but unfortunately may waste some space.
The primary index of a table should be as short as possible. This makes identification of one row easy and efficient. For each table, you have to decide which storage/index method to use.
Only create the indexes that you really need. Indexes are good for retrieval but bad when you need to store things fast. If you mostly access a table by searching on a combination of columns, make an index on them. The first index part should be the most used column. If you are ALWAYS using many columns, you should use the column with more duplicates first to get better compression of the index.
If it's very likely that a column has a unique prefix on the first number of characters, it's better to only index this prefix. MySQL supports an index on a part of a character column. Shorter indexes are faster not only because they take less disk space but also because they will give you more hits in the index cache and thus fewer disk seeks.
In some circumstances it can be beneficial to split into two a table that is scanned very often. This is especially true if it is a dynamic format table and it is possible to use a smaller static format table that can be used to find the relevant rows when scanning the table.

MySQL Interview Questions and Answers

1 Comments

Contact Form