Operation is not valid due to the current state of the object.

baidu_15160663 2017-02-20 02:46:25

KeywordQuery keywordQuery = new KeywordQuery(site);
这个一直报Operation is not valid due to the current state of the object.
同样的代码在控制台应用程序却没有问题，大家帮忙分析分析感激不尽



                Microsoft.SharePoint.SPSecurity.RunWithElevatedPrivileges(delegate ()

                {

                    using (SPSite thisSite = new SPSite("http://localhost:9090/"))

                    {

                        var thisWeb = thisSite.RootWeb;

                        thisWeb.AllowUnsafeUpdates = true;

                        SPUser user = thisWeb.EnsureUser("mossadmin");

                        string weburl = "http://localhost:9090/";

                        using (SPSite site = new SPSite(weburl, user.UserToken))

                        {

                            KeywordQuery keywordQuery = new KeywordQuery(site);

                            keywordQuery.QueryText = "huanghailong";

                            SearchExecutor searchExecutor = new SearchExecutor();

                            ResultTableCollection resultTableCollection = searchExecutor.ExecuteQuery(keywordQuery);

                            var resultTables = resultTableCollection.Filter("TableType", KnownTableTypes.RelevantResults);

                            var resultTable = resultTables.FirstOrDefault();

                            DataTable dataTable = resultTable.Table;

                            Console.WriteLine(dataTable.Rows.Count);

                        }

                    }

                });

...全文

2129 4 打赏收藏转发到动态举报

写回复

4 条回复

切换为时间正序

请发表友善的回复…

发表回复

段传涛 2017-04-01

打赏
举报

回复

从代码，没有看出错误来。调试一下吧，看看抛出什么错误。

Frank_CAU 2017-03-21

打赏
举报

回复

你的这个有问题的程序是在w3wp中运行的么？看你的这个code好像是Console的程序的code，可否把有问题的code复制出来，另外请问一下你的这个页面后台的code是在什么中运行？是在SharePoint的环境中运行的还是另外一个和SharePoint无关的ASP.Net站点？

霖雨版主 2017-02-21

打赏
举报

回复

var thisWeb = thisSite.RootWeb; 改一下呢 using(spweb thisWeb=thisSite.openweb(thisSite.RootWeb.Id)) 建议调试一下，看看哪句报错了。。如果是webservice，可以try。。。catch一下，把错误日志记录下来

Justin-Liu 2017-02-20

打赏
举报

回复

在控制台程序没问题的话，你这个有问题的是在什么类型的project下？

Overview Package Class Tree Deprecated Index Help PREV NEXT FRAMES NO FRAMES A B C D E F G H I J L P R S U V -------------------------------------------------------------------------------- A addCookie(Cookie) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to call addCookie(Cookie cookie) on the wrapped response object. addCookie(Cookie) - Method in interface javax.servlet.http.HttpServletResponse Adds the specified cookie to the response. addDateHeader(String, long) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to call addDateHeader(String name, long date) on the wrapped response object. addDateHeader(String, long) - Method in interface javax.servlet.http.HttpServletResponse Adds a response header with the given name and date-value. addHeader(String, String) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to return addHeader(String name, String value) on the wrapped response object. addHeader(String, String) - Method in interface javax.servlet.http.HttpServletResponse Adds a response header with the given name and value. addIntHeader(String, int) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to call addIntHeader(String name, int value) on the wrapped response object. addIntHeader(String, int) - Method in interface javax.servlet.http.HttpServletResponse Adds a response header with the given name and integer value. attributeAdded(HttpSessionBindingEvent) - Method in interface javax.servlet.http.HttpSessionAttributeListener Notification that an attribute has been added to a session. attributeAdded(ServletContextAttributeEvent) - Method in interface javax.servlet.ServletContextAttributeListener Notification that a new attribute was added to the servlet context. attributeAdded(ServletRequestAttributeEvent) - Method in interface javax.servlet.ServletRequestAttributeListener Notification that a new attribute was added to the servlet request. attributeRemoved(HttpSessionBindingEvent) - Method in interface javax.servlet.http.HttpSessionAttributeListener Notification that an attribute has been removed from a session. attributeRemoved(ServletContextAttributeEvent) - Method in interface javax.servlet.ServletContextAttributeListener Notification that an existing attribute has been removed from the servlet context. attributeRemoved(ServletRequestAttributeEvent) - Method in interface javax.servlet.ServletRequestAttributeListener Notification that a new attribute was removed from the servlet request. attributeReplaced(HttpSessionBindingEvent) - Method in interface javax.servlet.http.HttpSessionAttributeListener Notification that an attribute has been replaced in a session. attributeReplaced(ServletContextAttributeEvent) - Method in interface javax.servlet.ServletContextAttributeListener Notification that an attribute on the servlet context has been replaced. attributeReplaced(ServletRequestAttributeEvent) - Method in interface javax.servlet.ServletRequestAttributeListener Notification that an attribute was replaced on the servlet request. -------------------------------------------------------------------------------- B BASIC_AUTH - Static variable in interface javax.servlet.http.HttpServletRequest String identifier for Basic authentication. -------------------------------------------------------------------------------- C CLIENT_CERT_AUTH - Static variable in interface javax.servlet.http.HttpServletRequest String identifier for Client Certificate authentication. clone() - Method in class javax.servlet.http.Cookie Overrides the standard java.lang.Object.clone method to return a copy of this cookie. containsHeader(String) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to call containsHeader(String name) on the wrapped response object. containsHeader(String) - Method in interface javax.servlet.http.HttpServletResponse Returns a boolean indicating whether the named response header has already been set. contextDestroyed(ServletContextEvent) - Method in interface javax.servlet.ServletContextListener Notification that the servlet context is about to be shut down. contextInitialized(ServletContextEvent) - Method in interface javax.servlet.ServletContextListener Notification that the web application initialization process is starting. Cookie - class javax.servlet.http.Cookie. Creates a cookie, a small amount of information sent by a servlet to a Web browser, saved by the browser, and later sent back to the server. Cookie(String, String) - Constructor for class javax.servlet.http.Cookie Constructs a cookie with a specified name and value. -------------------------------------------------------------------------------- D destroy() - Method in interface javax.servlet.Filter Called by the web container to indicate to a filter that it is being taken out of service. destroy() - Method in interface javax.servlet.Servlet Called by the servlet container to indicate to a servlet that the servlet is being taken out of service. destroy() - Method in class javax.servlet.GenericServlet Called by the servlet container to indicate to a servlet that the servlet is being taken out of service. DIGEST_AUTH - Static variable in interface javax.servlet.http.HttpServletRequest String identifier for Digest authentication. doDelete(HttpServletRequest, HttpServletResponse) - Method in class javax.servlet.http.HttpServlet Called by the server (via the service method) to allow a servlet to handle a DELETE request. doFilter(ServletRequest, ServletResponse) - Method in interface javax.servlet.FilterChain Causes the next filter in the chain to be invoked, or if the calling filter is the last filter in the chain, causes the resource at the end of the chain to be invoked. doFilter(ServletRequest, ServletResponse, FilterChain) - Method in interface javax.servlet.Filter The doFilter method of the Filter is called by the container each time a request/response pair is passed through the chain due to a client request for a resource at the end of the chain. doGet(HttpServletRequest, HttpServletResponse) - Method in class javax.servlet.http.HttpServlet Called by the server (via the service method) to allow a servlet to handle a GET request. doHead(HttpServletRequest, HttpServletResponse) - Method in class javax.servlet.http.HttpServlet Receives an HTTP HEAD request from the protected service method and handles the request. doOptions(HttpServletRequest, HttpServletResponse) - Method in class javax.servlet.http.HttpServlet Called by the server (via the service method) to allow a servlet to handle a OPTIONS request. doPost(HttpServletRequest, HttpServletResponse) - Method in class javax.servlet.http.HttpServlet Called by the server (via the service method) to allow a servlet to handle a POST request. doPut(HttpServletRequest, HttpServletResponse) - Method in class javax.servlet.http.HttpServlet Called by the server (via the service method) to allow a servlet to handle a PUT request. doTrace(HttpServletRequest, HttpServletResponse) - Method in class javax.servlet.http.HttpServlet Called by the server (via the service method) to allow a servlet to handle a TRACE request. -------------------------------------------------------------------------------- E encodeRedirectUrl(String) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to return encodeRedirectUrl(String url) on the wrapped response object. encodeRedirectUrl(String) - Method in interface javax.servlet.http.HttpServletResponse Deprecated. As of version 2.1, use encodeRedirectURL(String url) instead encodeRedirectURL(String) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to return encodeRedirectURL(String url) on the wrapped response object. encodeRedirectURL(String) - Method in interface javax.servlet.http.HttpServletResponse Encodes the specified URL for use in the sendRedirect method or, if encoding is not needed, returns the URL unchanged. encodeUrl(String) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to call encodeUrl(String url) on the wrapped response object. encodeUrl(String) - Method in interface javax.servlet.http.HttpServletResponse Deprecated. As of version 2.1, use encodeURL(String url) instead encodeURL(String) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to call encodeURL(String url) on the wrapped response object. encodeURL(String) - Method in interface javax.servlet.http.HttpServletResponse Encodes the specified URL by including the session ID in it, or, if encoding is not needed, returns the URL unchanged. -------------------------------------------------------------------------------- F Filter - interface javax.servlet.Filter. A filter is an object that performs filtering tasks on either the request to a resource (a servlet or static content), or on the response from a resource, or both. Filters perform filtering in the doFilter method. FilterChain - interface javax.servlet.FilterChain. A FilterChain is an object provided by the servlet container to the developer giving a view into the invocation chain of a filtered request for a resource. FilterConfig - interface javax.servlet.FilterConfig. A filter configuration object used by a servlet container to pass information to a filter during initialization. flushBuffer() - Method in interface javax.servlet.ServletResponse Forces any content in the buffer to be written to the client. flushBuffer() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to call flushBuffer() on the wrapped response object. FORM_AUTH - Static variable in interface javax.servlet.http.HttpServletRequest String identifier for Form authentication. forward(ServletRequest, ServletResponse) - Method in interface javax.servlet.RequestDispatcher Forwards a request from a servlet to another resource (servlet, JSP file, or HTML file) on the server. -------------------------------------------------------------------------------- G GenericServlet - class javax.servlet.GenericServlet. Defines a generic, protocol-independent servlet. GenericServlet() - Constructor for class javax.servlet.GenericServlet Does nothing. getAttribute(String) - Method in interface javax.servlet.ServletContext Returns the servlet container attribute with the given name, or null if there is no attribute by that name. getAttribute(String) - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to call getAttribute(String name) on the wrapped request object. getAttribute(String) - Method in interface javax.servlet.ServletRequest Returns the value of the named attribute as an Object, or null if no attribute of the given name exists. getAttribute(String) - Method in interface javax.servlet.http.HttpSession Returns the object bound with the specified name in this session, or null if no object is bound under the name. getAttributeNames() - Method in interface javax.servlet.ServletContext Returns an Enumeration containing the attribute names available within this servlet context. getAttributeNames() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getAttributeNames() on the wrapped request object. getAttributeNames() - Method in interface javax.servlet.ServletRequest Returns an Enumeration containing the names of the attributes available to this request. getAttributeNames() - Method in interface javax.servlet.http.HttpSession Returns an Enumeration of String objects containing the names of all the objects bound to this session. getAuthType() - Method in interface javax.servlet.http.HttpServletRequest Returns the name of the authentication scheme used to protect the servlet. getAuthType() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getAuthType() on the wrapped request object. getBufferSize() - Method in interface javax.servlet.ServletResponse Returns the actual buffer size used for the response. getBufferSize() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to return getBufferSize() on the wrapped response object. getCharacterEncoding() - Method in interface javax.servlet.ServletResponse Returns the name of the character encoding (MIME charset) used for the body sent in this response. getCharacterEncoding() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getCharacterEncoding() on the wrapped request object. getCharacterEncoding() - Method in interface javax.servlet.ServletRequest Returns the name of the character encoding used in the body of this request. getCharacterEncoding() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to return getCharacterEncoding() on the wrapped response object. getComment() - Method in class javax.servlet.http.Cookie Returns the comment describing the purpose of this cookie, or null if the cookie has no comment. getContentLength() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getContentLength() on the wrapped request object. getContentLength() - Method in interface javax.servlet.ServletRequest Returns the length, in bytes, of the request body and made available by the input stream, or -1 if the length is not known. getContentType() - Method in interface javax.servlet.ServletResponse Returns the content type used for the MIME body sent in this response. getContentType() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getContentType() on the wrapped request object. getContentType() - Method in interface javax.servlet.ServletRequest Returns the MIME type of the body of the request, or null if the type is not known. getContentType() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to return getContentType() on the wrapped response object. getContext(String) - Method in interface javax.servlet.ServletContext Returns a ServletContext object that corresponds to a specified URL on the server. getContextPath() - Method in interface javax.servlet.http.HttpServletRequest Returns the portion of the request URI that indicates the context of the request. getContextPath() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getContextPath() on the wrapped request object. getCookies() - Method in interface javax.servlet.http.HttpServletRequest Returns an array containing all of the Cookie objects the client sent with this request. getCookies() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getCookies() on the wrapped request object. getCreationTime() - Method in interface javax.servlet.http.HttpSession Returns the time when this session was created, measured in milliseconds since midnight January 1, 1970 GMT. getDateHeader(String) - Method in interface javax.servlet.http.HttpServletRequest Returns the value of the specified request header as a long value that represents a Date object. getDateHeader(String) - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getDateHeader(String name) on the wrapped request object. getDomain() - Method in class javax.servlet.http.Cookie Returns the domain name set for this cookie. getFilterName() - Method in interface javax.servlet.FilterConfig Returns the filter-name of this filter as defined in the deployment descriptor. getHeader(String) - Method in interface javax.servlet.http.HttpServletRequest Returns the value of the specified request header as a String. getHeader(String) - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getHeader(String name) on the wrapped request object. getHeaderNames() - Method in interface javax.servlet.http.HttpServletRequest Returns an enumeration of all the header names this request contains. getHeaderNames() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getHeaderNames() on the wrapped request object. getHeaders(String) - Method in interface javax.servlet.http.HttpServletRequest Returns all the values of the specified request header as an Enumeration of String objects. getHeaders(String) - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getHeaders(String name) on the wrapped request object. getId() - Method in interface javax.servlet.http.HttpSession Returns a string containing the unique identifier assigned to this session. getIds() - Method in interface javax.servlet.http.HttpSessionContext Deprecated. As of Java Servlet API 2.1 with no replacement. This method must return an empty Enumeration and will be removed in a future version of this API. getInitParameter(String) - Method in interface javax.servlet.FilterConfig Returns a String containing the value of the named initialization parameter, or null if the parameter does not exist. getInitParameter(String) - Method in interface javax.servlet.ServletConfig Returns a String containing the value of the named initialization parameter, or null if the parameter does not exist. getInitParameter(String) - Method in interface javax.servlet.ServletContext Returns a String containing the value of the named context-wide initialization parameter, or null if the parameter does not exist. getInitParameter(String) - Method in class javax.servlet.GenericServlet Returns a String containing the value of the named initialization parameter, or null if the parameter does not exist. getInitParameterNames() - Method in interface javax.servlet.FilterConfig Returns the names of the filter's initialization parameters as an Enumeration of String objects, or an empty Enumeration if the filter has no initialization parameters. getInitParameterNames() - Method in interface javax.servlet.ServletConfig Returns the names of the servlet's initialization parameters as an Enumeration of String objects, or an empty Enumeration if the servlet has no initialization parameters. getInitParameterNames() - Method in interface javax.servlet.ServletContext Returns the names of the context's initialization parameters as an Enumeration of String objects, or an empty Enumeration if the context has no initialization parameters. getInitParameterNames() - Method in class javax.servlet.GenericServlet Returns the names of the servlet's initialization parameters as an Enumeration of String objects, or an empty Enumeration if the servlet has no initialization parameters. getInputStream() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getInputStream() on the wrapped request object. getInputStream() - Method in interface javax.servlet.ServletRequest Retrieves the body of the request as binary data using a ServletInputStream. getIntHeader(String) - Method in interface javax.servlet.http.HttpServletRequest Returns the value of the specified request header as an int. getIntHeader(String) - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getIntHeader(String name) on the wrapped request object. getLastAccessedTime() - Method in interface javax.servlet.http.HttpSession Returns the last time the client sent a request associated with this session, as the number of milliseconds since midnight January 1, 1970 GMT, and marked by the time the container received the request. getLastModified(HttpServletRequest) - Method in class javax.servlet.http.HttpServlet Returns the time the HttpServletRequest object was last modified, in milliseconds since midnight January 1, 1970 GMT. getLocalAddr() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getLocalAddr() on the wrapped request object. getLocalAddr() - Method in interface javax.servlet.ServletRequest Returns the Internet Protocol (IP) address of the interface on which the request was received. getLocale() - Method in interface javax.servlet.ServletResponse Returns the locale specified for this response using the ServletResponse.setLocale(java.util.Locale) method. getLocale() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getLocale() on the wrapped request object. getLocale() - Method in interface javax.servlet.ServletRequest Returns the preferred Locale that the client will accept content in, based on the Accept-Language header. getLocale() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to return getLocale() on the wrapped response object. getLocales() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getLocales() on the wrapped request object. getLocales() - Method in interface javax.servlet.ServletRequest Returns an Enumeration of Locale objects indicating, in decreasing order starting with the preferred locale, the locales that are acceptable to the client based on the Accept-Language header. getLocalName() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getLocalName() on the wrapped request object. getLocalName() - Method in interface javax.servlet.ServletRequest Returns the host name of the Internet Protocol (IP) interface on which the request was received. getLocalPort() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getLocalPort() on the wrapped request object. getLocalPort() - Method in interface javax.servlet.ServletRequest Returns the Internet Protocol (IP) port number of the interface on which the request was received. getMajorVersion() - Method in interface javax.servlet.ServletContext Returns the major version of the Java Servlet API that this servlet container supports. getMaxAge() - Method in class javax.servlet.http.Cookie Returns the maximum age of the cookie, specified in seconds, By default, -1 indicating the cookie will persist until browser shutdown. getMaxInactiveInterval() - Method in interface javax.servlet.http.HttpSession Returns the maximum time interval, in seconds, that the servlet container will keep this session open between client accesses. getMethod() - Method in interface javax.servlet.http.HttpServletRequest Returns the name of the HTTP method with which this request was made, for example, GET, POST, or PUT. getMethod() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getMethod() on the wrapped request object. getMimeType(String) - Method in interface javax.servlet.ServletContext Returns the MIME type of the specified file, or null if the MIME type is not known. getMinorVersion() - Method in interface javax.servlet.ServletContext Returns the minor version of the Servlet API that this servlet container supports. getName() - Method in class javax.servlet.ServletContextAttributeEvent Return the name of the attribute that changed on the ServletContext. getName() - Method in class javax.servlet.ServletRequestAttributeEvent Return the name of the attribute that changed on the ServletRequest getName() - Method in class javax.servlet.http.HttpSessionBindingEvent Returns the name with which the attribute is bound to or unbound from the session. getName() - Method in class javax.servlet.http.Cookie Returns the name of the cookie. getNamedDispatcher(String) - Method in interface javax.servlet.ServletContext Returns a RequestDispatcher object that acts as a wrapper for the named servlet. getOutputStream() - Method in interface javax.servlet.ServletResponse Returns a ServletOutputStream suitable for writing binary data in the response. getOutputStream() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to return getOutputStream() on the wrapped response object. getParameter(String) - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getParameter(String name) on the wrapped request object. getParameter(String) - Method in interface javax.servlet.ServletRequest Returns the value of a request parameter as a String, or null if the parameter does not exist. getParameterMap() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getParameterMap() on the wrapped request object. getParameterMap() - Method in interface javax.servlet.ServletRequest Returns a java.util.Map of the parameters of this request. getParameterNames() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getParameterNames() on the wrapped request object. getParameterNames() - Method in interface javax.servlet.ServletRequest Returns an Enumeration of String objects containing the names of the parameters contained in this request. getParameterValues(String) - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getParameterValues(String name) on the wrapped request object. getParameterValues(String) - Method in interface javax.servlet.ServletRequest Returns an array of String objects containing all of the values the given request parameter has, or null if the parameter does not exist. getPath() - Method in class javax.servlet.http.Cookie Returns the path on the server to which the browser returns this cookie. getPathInfo() - Method in interface javax.servlet.http.HttpServletRequest Returns any extra path information associated with the URL the client sent when it made this request. getPathInfo() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getPathInfo() on the wrapped request object. getPathTranslated() - Method in interface javax.servlet.http.HttpServletRequest Returns any extra path information after the servlet name but before the query string, and translates it to a real path. getPathTranslated() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getPathTranslated() on the wrapped request object. getProtocol() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getProtocol() on the wrapped request object. getProtocol() - Method in interface javax.servlet.ServletRequest Returns the name and version of the protocol the request uses in the form protocol/majorVersion.minorVersion, for example, HTTP/1.1. getQueryString() - Method in interface javax.servlet.http.HttpServletRequest Returns the query string that is contained in the request URL after the path. getQueryString() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getQueryString() on the wrapped request object. getReader() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getReader() on the wrapped request object. getReader() - Method in interface javax.servlet.ServletRequest Retrieves the body of the request as character data using a BufferedReader. getRealPath(String) - Method in interface javax.servlet.ServletContext Returns a String containing the real path for a given virtual path. getRealPath(String) - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getRealPath(String path) on the wrapped request object. getRealPath(String) - Method in interface javax.servlet.ServletRequest Deprecated. As of Version 2.1 of the Java Servlet API, use ServletContext.getRealPath(java.lang.String) instead. getRemoteAddr() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getRemoteAddr() on the wrapped request object. getRemoteAddr() - Method in interface javax.servlet.ServletRequest Returns the Internet Protocol (IP) address of the client or last proxy that sent the request. getRemoteHost() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getRemoteHost() on the wrapped request object. getRemoteHost() - Method in interface javax.servlet.ServletRequest Returns the fully qualified name of the client or the last proxy that sent the request. getRemotePort() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getRemotePort() on the wrapped request object. getRemotePort() - Method in interface javax.servlet.ServletRequest Returns the Internet Protocol (IP) source port of the client or last proxy that sent the request. getRemoteUser() - Method in interface javax.servlet.http.HttpServletRequest Returns the login of the user making this request, if the user has been authenticated, or null if the user has not been authenticated. getRemoteUser() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getRemoteUser() on the wrapped request object. getRequest() - Method in class javax.servlet.ServletRequestWrapper Return the wrapped request object. getRequestDispatcher(String) - Method in interface javax.servlet.ServletContext Returns a RequestDispatcher object that acts as a wrapper for the resource located at the given path. getRequestDispatcher(String) - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getRequestDispatcher(String path) on the wrapped request object. getRequestDispatcher(String) - Method in interface javax.servlet.ServletRequest Returns a RequestDispatcher object that acts as a wrapper for the resource located at the given path. getRequestedSessionId() - Method in interface javax.servlet.http.HttpServletRequest Returns the session ID specified by the client. getRequestedSessionId() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getRequestedSessionId() on the wrapped request object. getRequestURI() - Method in interface javax.servlet.http.HttpServletRequest Returns the part of this request's URL from the protocol name up to the query string in the first line of the HTTP request. getRequestURI() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getRequestURI() on the wrapped request object. getRequestURL() - Method in interface javax.servlet.http.HttpServletRequest Reconstructs the URL the client used to make the request. getRequestURL() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getRequestURL() on the wrapped request object. getRequestURL(HttpServletRequest) - Static method in class javax.servlet.http.HttpUtils Deprecated. Reconstructs the URL the client used to make the request, using information in the HttpServletRequest object. getResource(String) - Method in interface javax.servlet.ServletContext Returns a URL to the resource that is mapped to a specified path. getResourceAsStream(String) - Method in interface javax.servlet.ServletContext Returns the resource located at the named path as an InputStream object. getResourcePaths(String) - Method in interface javax.servlet.ServletContext Returns a directory-like listing of all the paths to resources within the web application whose longest sub-path matches the supplied path argument. getResponse() - Method in class javax.servlet.ServletResponseWrapper Return the wrapped ServletResponse object. getRootCause() - Method in class javax.servlet.ServletException Returns the exception that caused this servlet exception. getScheme() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getScheme() on the wrapped request object. getScheme() - Method in interface javax.servlet.ServletRequest Returns the name of the scheme used to make this request, for example, http, https, or ftp. getSecure() - Method in class javax.servlet.http.Cookie Returns true if the browser is sending cookies only over a secure protocol, or false if the browser can send cookies using any protocol. getServerInfo() - Method in interface javax.servlet.ServletContext Returns the name and version of the servlet container on which the servlet is running. getServerName() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getServerName() on the wrapped request object. getServerName() - Method in interface javax.servlet.ServletRequest Returns the host name of the server to which the request was sent. getServerPort() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return getServerPort() on the wrapped request object. getServerPort() - Method in interface javax.servlet.ServletRequest Returns the port number to which the request was sent. getServlet() - Method in class javax.servlet.UnavailableException Deprecated. As of Java Servlet API 2.2, with no replacement. Returns the servlet that is reporting its unavailability. getServlet(String) - Method in interface javax.servlet.ServletContext Deprecated. As of Java Servlet API 2.1, with no direct replacement. This method was originally defined to retrieve a servlet from a ServletContext. In this version, this method always returns null and remains only to preserve binary compatibility. This method will be permanently removed in a future version of the Java Servlet API. In lieu of this method, servlets can share information using the ServletContext class and can perform shared business logic by invoking methods on common non-servlet classes. getServletConfig() - Method in interface javax.servlet.Servlet Returns a ServletConfig object, which contains initialization and startup parameters for this servlet. getServletConfig() - Method in class javax.servlet.GenericServlet Returns this servlet's ServletConfig object. getServletContext() - Method in class javax.servlet.ServletRequestEvent Returns the ServletContext of this web application. getServletContext() - Method in interface javax.servlet.FilterConfig Returns a reference to the ServletContext in which the caller is executing. getServletContext() - Method in interface javax.servlet.ServletConfig Returns a reference to the ServletContext in which the caller is executing. getServletContext() - Method in class javax.servlet.ServletContextEvent Return the ServletContext that changed. getServletContext() - Method in class javax.servlet.GenericServlet Returns a reference to the ServletContext in which this servlet is running. getServletContext() - Method in interface javax.servlet.http.HttpSession Returns the ServletContext to which this session belongs. getServletContextName() - Method in interface javax.servlet.ServletContext Returns the name of this web application corresponding to this ServletContext as specified in the deployment descriptor for this web application by the display-name element. getServletInfo() - Method in interface javax.servlet.Servlet Returns information about the servlet, such as author, version, and copyright. getServletInfo() - Method in class javax.servlet.GenericServlet Returns information about the servlet, such as author, version, and copyright. getServletName() - Method in interface javax.servlet.ServletConfig Returns the name of this servlet instance. getServletName() - Method in class javax.servlet.GenericServlet Returns the name of this servlet instance. getServletNames() - Method in interface javax.servlet.ServletContext Deprecated. As of Java Servlet API 2.1, with no replacement. This method was originally defined to return an Enumeration of all the servlet names known to this context. In this version, this method always returns an empty Enumeration and remains only to preserve binary compatibility. This method will be permanently removed in a future version of the Java Servlet API. getServletPath() - Method in interface javax.servlet.http.HttpServletRequest Returns the part of this request's URL that calls the servlet. getServletPath() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getServletPath() on the wrapped request object. getServletRequest() - Method in class javax.servlet.ServletRequestEvent Returns the ServletRequest that is changing. getServlets() - Method in interface javax.servlet.ServletContext Deprecated. As of Java Servlet API 2.0, with no replacement. This method was originally defined to return an Enumeration of all the servlets known to this servlet context. In this version, this method always returns an empty enumeration and remains only to preserve binary compatibility. This method will be permanently removed in a future version of the Java Servlet API. getSession() - Method in class javax.servlet.http.HttpSessionEvent Return the session that changed. getSession() - Method in class javax.servlet.http.HttpSessionBindingEvent Return the session that changed. getSession() - Method in interface javax.servlet.http.HttpServletRequest Returns the current session associated with this request, or if the request does not have a session, creates one. getSession() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getSession() on the wrapped request object. getSession(boolean) - Method in interface javax.servlet.http.HttpServletRequest Returns the current HttpSession associated with this request or, if there is no current session and create is true, returns a new session. getSession(boolean) - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getSession(boolean create) on the wrapped request object. getSession(String) - Method in interface javax.servlet.http.HttpSessionContext Deprecated. As of Java Servlet API 2.1 with no replacement. This method must return null and will be removed in a future version of this API. getSessionContext() - Method in interface javax.servlet.http.HttpSession Deprecated. As of Version 2.1, this method is deprecated and has no replacement. It will be removed in a future version of the Java Servlet API. getUnavailableSeconds() - Method in class javax.servlet.UnavailableException Returns the number of seconds the servlet expects to be temporarily unavailable. getUserPrincipal() - Method in interface javax.servlet.http.HttpServletRequest Returns a java.security.Principal object containing the name of the current authenticated user. getUserPrincipal() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return getUserPrincipal() on the wrapped request object. getValue() - Method in class javax.servlet.ServletContextAttributeEvent Returns the value of the attribute that has been added, removed, or replaced. getValue() - Method in class javax.servlet.ServletRequestAttributeEvent Returns the value of the attribute that has been added, removed or replaced. getValue() - Method in class javax.servlet.http.HttpSessionBindingEvent Returns the value of the attribute that has been added, removed or replaced. getValue() - Method in class javax.servlet.http.Cookie Returns the value of the cookie. getValue(String) - Method in interface javax.servlet.http.HttpSession Deprecated. As of Version 2.2, this method is replaced by HttpSession.getAttribute(java.lang.String). getValueNames() - Method in interface javax.servlet.http.HttpSession Deprecated. As of Version 2.2, this method is replaced by HttpSession.getAttributeNames() getVersion() - Method in class javax.servlet.http.Cookie Returns the version of the protocol this cookie complies with. getWriter() - Method in interface javax.servlet.ServletResponse Returns a PrintWriter object that can send character text to the client. getWriter() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to return getWriter() on the wrapped response object. -------------------------------------------------------------------------------- H HttpServlet - class javax.servlet.http.HttpServlet. Provides an abstract class to be subclassed to create an HTTP servlet suitable for a Web site. HttpServlet() - Constructor for class javax.servlet.http.HttpServlet Does nothing, because this is an abstract class. HttpServletRequest - interface javax.servlet.http.HttpServletRequest. Extends the ServletRequest interface to provide request information for HTTP servlets. HttpServletRequestWrapper - class javax.servlet.http.HttpServletRequestWrapper. Provides a convenient implementation of the HttpServletRequest interface that can be subclassed by developers wishing to adapt the request to a Servlet. HttpServletRequestWrapper(HttpServletRequest) - Constructor for class javax.servlet.http.HttpServletRequestWrapper Constructs a request object wrapping the given request. HttpServletResponse - interface javax.servlet.http.HttpServletResponse. Extends the ServletResponse interface to provide HTTP-specific functionality in sending a response. HttpServletResponseWrapper - class javax.servlet.http.HttpServletResponseWrapper. Provides a convenient implementation of the HttpServletResponse interface that can be subclassed by developers wishing to adapt the response from a Servlet. HttpServletResponseWrapper(HttpServletResponse) - Constructor for class javax.servlet.http.HttpServletResponseWrapper Constructs a response adaptor wrapping the given response. HttpSession - interface javax.servlet.http.HttpSession. Provides a way to identify a user across more than one page request or visit to a Web site and to store information about that user. HttpSessionActivationListener - interface javax.servlet.http.HttpSessionActivationListener. Objects that are bound to a session may listen to container events notifying them that sessions will be passivated and that session will be activated. HttpSessionAttributeListener - interface javax.servlet.http.HttpSessionAttributeListener. This listener interface can be implemented in order to get notifications of changes to the attribute lists of sessions within this web application. HttpSessionBindingEvent - class javax.servlet.http.HttpSessionBindingEvent. Events of this type are either sent to an object that implements HttpSessionBindingListener when it is bound or unbound from a session, or to a HttpSessionAttributeListener that has been configured in the deployment descriptor when any attribute is bound, unbound or replaced in a session. HttpSessionBindingEvent(HttpSession, String) - Constructor for class javax.servlet.http.HttpSessionBindingEvent Constructs an event that notifies an object that it has been bound to or unbound from a session. HttpSessionBindingEvent(HttpSession, String, Object) - Constructor for class javax.servlet.http.HttpSessionBindingEvent Constructs an event that notifies an object that it has been bound to or unbound from a session. HttpSessionBindingListener - interface javax.servlet.http.HttpSessionBindingListener. Causes an object to be notified when it is bound to or unbound from a session. HttpSessionContext - interface javax.servlet.http.HttpSessionContext. Deprecated. As of Java(tm) Servlet API 2.1 for security reasons, with no replacement. This interface will be removed in a future version of this API. HttpSessionEvent - class javax.servlet.http.HttpSessionEvent. This is the class representing event notifications for changes to sessions within a web application. HttpSessionEvent(HttpSession) - Constructor for class javax.servlet.http.HttpSessionEvent Construct a session event from the given source. HttpSessionListener - interface javax.servlet.http.HttpSessionListener. Implementations of this interface are notified of changes to the list of active sessions in a web application. HttpUtils - class javax.servlet.http.HttpUtils. Deprecated. As of Java(tm) Servlet API 2.3. These methods were only useful with the default encoding and have been moved to the request interfaces. HttpUtils() - Constructor for class javax.servlet.http.HttpUtils Deprecated. Constructs an empty HttpUtils object. -------------------------------------------------------------------------------- I include(ServletRequest, ServletResponse) - Method in interface javax.servlet.RequestDispatcher Includes the content of a resource (servlet, JSP page, HTML file) in the response. init() - Method in class javax.servlet.GenericServlet A convenience method which can be overridden so that there's no need to call super.init(config). init(FilterConfig) - Method in interface javax.servlet.Filter Called by the web container to indicate to a filter that it is being placed into service. init(ServletConfig) - Method in interface javax.servlet.Servlet Called by the servlet container to indicate to a servlet that the servlet is being placed into service. init(ServletConfig) - Method in class javax.servlet.GenericServlet Called by the servlet container to indicate to a servlet that the servlet is being placed into service. invalidate() - Method in interface javax.servlet.http.HttpSession Invalidates this session then unbinds any objects bound to it. isCommitted() - Method in interface javax.servlet.ServletResponse Returns a boolean indicating if the response has been committed. isCommitted() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to return isCommitted() on the wrapped response object. isNew() - Method in interface javax.servlet.http.HttpSession Returns true if the client does not yet know about the session or if the client chooses not to join the session. isPermanent() - Method in class javax.servlet.UnavailableException Returns a boolean indicating whether the servlet is permanently unavailable. isRequestedSessionIdFromCookie() - Method in interface javax.servlet.http.HttpServletRequest Checks whether the requested session ID came in as a cookie. isRequestedSessionIdFromCookie() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return isRequestedSessionIdFromCookie() on the wrapped request object. isRequestedSessionIdFromUrl() - Method in interface javax.servlet.http.HttpServletRequest Deprecated. As of Version 2.1 of the Java Servlet API, use HttpServletRequest.isRequestedSessionIdFromURL() instead. isRequestedSessionIdFromUrl() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return isRequestedSessionIdFromUrl() on the wrapped request object. isRequestedSessionIdFromURL() - Method in interface javax.servlet.http.HttpServletRequest Checks whether the requested session ID came in as part of the request URL. isRequestedSessionIdFromURL() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return isRequestedSessionIdFromURL() on the wrapped request object. isRequestedSessionIdValid() - Method in interface javax.servlet.http.HttpServletRequest Checks whether the requested session ID is still valid. isRequestedSessionIdValid() - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return isRequestedSessionIdValid() on the wrapped request object. isSecure() - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to return isSecure() on the wrapped request object. isSecure() - Method in interface javax.servlet.ServletRequest Returns a boolean indicating whether this request was made using a secure channel, such as HTTPS. isUserInRole(String) - Method in interface javax.servlet.http.HttpServletRequest Returns a boolean indicating whether the authenticated user is included in the specified logical "role". isUserInRole(String) - Method in class javax.servlet.http.HttpServletRequestWrapper The default behavior of this method is to return isUserInRole(String role) on the wrapped request object. -------------------------------------------------------------------------------- J javax.servlet - package javax.servlet This chapter describes the javax.servlet package. javax.servlet.http - package javax.servlet.http This chapter describes the javax.servlet.http package. -------------------------------------------------------------------------------- L log(Exception, String) - Method in interface javax.servlet.ServletContext Deprecated. As of Java Servlet API 2.1, use ServletContext.log(String message, Throwable throwable) instead. This method was originally defined to write an exception's stack trace and an explanatory error message to the servlet log file. log(String) - Method in interface javax.servlet.ServletContext Writes the specified message to a servlet log file, usually an event log. log(String) - Method in class javax.servlet.GenericServlet Writes the specified message to a servlet log file, prepended by the servlet's name. log(String, Throwable) - Method in interface javax.servlet.ServletContext Writes an explanatory message and a stack trace for a given Throwable exception to the servlet log file. log(String, Throwable) - Method in class javax.servlet.GenericServlet Writes an explanatory message and a stack trace for a given Throwable exception to the servlet log file, prepended by the servlet's name. -------------------------------------------------------------------------------- P parsePostData(int, ServletInputStream) - Static method in class javax.servlet.http.HttpUtils Deprecated. Parses data from an HTML form that the client sends to the server using the HTTP POST method and the application/x-www-form-urlencoded MIME type. parseQueryString(String) - Static method in class javax.servlet.http.HttpUtils Deprecated. Parses a query string passed from the client to the server and builds a HashTable object with key-value pairs. print(boolean) - Method in class javax.servlet.ServletOutputStream Writes a boolean value to the client, with no carriage return-line feed (CRLF) character at the end. print(char) - Method in class javax.servlet.ServletOutputStream Writes a character to the client, with no carriage return-line feed (CRLF) at the end. print(double) - Method in class javax.servlet.ServletOutputStream Writes a double value to the client, with no carriage return-line feed (CRLF) at the end. print(float) - Method in class javax.servlet.ServletOutputStream Writes a float value to the client, with no carriage return-line feed (CRLF) at the end. print(int) - Method in class javax.servlet.ServletOutputStream Writes an int to the client, with no carriage return-line feed (CRLF) at the end. print(long) - Method in class javax.servlet.ServletOutputStream Writes a long value to the client, with no carriage return-line feed (CRLF) at the end. print(String) - Method in class javax.servlet.ServletOutputStream Writes a String to the client, without a carriage return-line feed (CRLF) character at the end. println() - Method in class javax.servlet.ServletOutputStream Writes a carriage return-line feed (CRLF) to the client. println(boolean) - Method in class javax.servlet.ServletOutputStream Writes a boolean value to the client, followed by a carriage return-line feed (CRLF). println(char) - Method in class javax.servlet.ServletOutputStream Writes a character to the client, followed by a carriage return-line feed (CRLF). println(double) - Method in class javax.servlet.ServletOutputStream Writes a double value to the client, followed by a carriage return-line feed (CRLF). println(float) - Method in class javax.servlet.ServletOutputStream Writes a float value to the client, followed by a carriage return-line feed (CRLF). println(int) - Method in class javax.servlet.ServletOutputStream Writes an int to the client, followed by a carriage return-line feed (CRLF) character. println(long) - Method in class javax.servlet.ServletOutputStream Writes a long value to the client, followed by a carriage return-line feed (CRLF). println(String) - Method in class javax.servlet.ServletOutputStream Writes a String to the client, followed by a carriage return-line feed (CRLF). putValue(String, Object) - Method in interface javax.servlet.http.HttpSession Deprecated. As of Version 2.2, this method is replaced by HttpSession.setAttribute(java.lang.String, java.lang.Object) -------------------------------------------------------------------------------- R readLine(byte[], int, int) - Method in class javax.servlet.ServletInputStream Reads the input stream, one line at a time. removeAttribute(String) - Method in interface javax.servlet.ServletContext Removes the attribute with the given name from the servlet context. removeAttribute(String) - Method in class javax.servlet.ServletRequestWrapper The default behavior of this method is to call removeAttribute(String name) on the wrapped request object. removeAttribute(String) - Method in interface javax.servlet.ServletRequest Removes an attribute from this request. removeAttribute(String) - Method in interface javax.servlet.http.HttpSession Removes the object bound with the specified name from this session. removeValue(String) - Method in interface javax.servlet.http.HttpSession Deprecated. As of Version 2.2, this method is replaced by HttpSession.removeAttribute(java.lang.String) requestDestroyed(ServletRequestEvent) - Method in interface javax.servlet.ServletRequestListener The request is about to go out of scope of the web application. RequestDispatcher - interface javax.servlet.RequestDispatcher. Defines an object that receives requests from the client and sends them to any resource (such as a servlet, HTML file, or JSP file) on the server. requestInitialized(ServletRequestEvent) - Method in interface javax.servlet.ServletRequestListener The request is about to come into scope of the web application. reset() - Method in interface javax.servlet.ServletResponse Clears any data that exists in the buffer as well as the status code and headers. reset() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to call reset() on the wrapped response object. resetBuffer() - Method in interface javax.servlet.ServletResponse Clears the content of the underlying buffer in the response without clearing headers or status code. resetBuffer() - Method in class javax.servlet.ServletResponseWrapper The default behavior of this method is to call resetBuffer() on the wrapped response object. -------------------------------------------------------------------------------- S SC_ACCEPTED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (202) indicating that a request was accepted for processing, but was not completed. SC_BAD_GATEWAY - Static variable in interface javax.servlet.http.HttpServletResponse Status code (502) indicating that the HTTP server received an invalid response from a server it consulted when acting as a proxy or gateway. SC_BAD_REQUEST - Static variable in interface javax.servlet.http.HttpServletResponse Status code (400) indicating the request sent by the client was syntactically incorrect. SC_CONFLICT - Static variable in interface javax.servlet.http.HttpServletResponse Status code (409) indicating that the request could not be completed due to a conflict with the current state of the resource. SC_CONTINUE - Static variable in interface javax.servlet.http.HttpServletResponse Status code (100) indicating the client can continue. SC_CREATED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (201) indicating the request succeeded and created a new resource on the server. SC_EXPECTATION_FAILED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (417) indicating that the server could not meet the expectation given in the Expect request header. SC_FORBIDDEN - Static variable in interface javax.servlet.http.HttpServletResponse Status code (403) indicating the server understood the request but refused to fulfill it. SC_FOUND - Static variable in interface javax.servlet.http.HttpServletResponse Status code (302) indicating that the resource reside temporarily under a different URI. SC_GATEWAY_TIMEOUT - Static variable in interface javax.servlet.http.HttpServletResponse Status code (504) indicating that the server did not receive a timely response from the upstream server while acting as a gateway or proxy. SC_GONE - Static variable in interface javax.servlet.http.HttpServletResponse Status code (410) indicating that the resource is no longer available at the server and no forwarding address is known. SC_HTTP_VERSION_NOT_SUPPORTED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (505) indicating that the server does not support or refuses to support the HTTP protocol version that was used in the request message. SC_INTERNAL_SERVER_ERROR - Static variable in interface javax.servlet.http.HttpServletResponse Status code (500) indicating an error inside the HTTP server which prevented it from fulfilling the request. SC_LENGTH_REQUIRED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (411) indicating that the request cannot be handled without a defined Content-Length. SC_METHOD_NOT_ALLOWED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (405) indicating that the method specified in the Request-Line is not allowed for the resource identified by the Request-URI. SC_MOVED_PERMANENTLY - Static variable in interface javax.servlet.http.HttpServletResponse Status code (301) indicating that the resource has permanently moved to a new location, and that future references should use a new URI with their requests. SC_MOVED_TEMPORARILY - Static variable in interface javax.servlet.http.HttpServletResponse Status code (302) indicating that the resource has temporarily moved to another location, but that future references should still use the original URI to access the resource. SC_MULTIPLE_CHOICES - Static variable in interface javax.servlet.http.HttpServletResponse Status code (300) indicating that the requested resource corresponds to any one of a set of representations, each with its own specific location. SC_NO_CONTENT - Static variable in interface javax.servlet.http.HttpServletResponse Status code (204) indicating that the request succeeded but that there was no new information to return. SC_NON_AUTHORITATIVE_INFORMATION - Static variable in interface javax.servlet.http.HttpServletResponse Status code (203) indicating that the meta information presented by the client did not originate from the server. SC_NOT_ACCEPTABLE - Static variable in interface javax.servlet.http.HttpServletResponse Status code (406) indicating that the resource identified by the request is only capable of generating response entities which have content characteristics not acceptable according to the accept headers sent in the request. SC_NOT_FOUND - Static variable in interface javax.servlet.http.HttpServletResponse Status code (404) indicating that the requested resource is not available. SC_NOT_IMPLEMENTED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (501) indicating the HTTP server does not support the functionality needed to fulfill the request. SC_NOT_MODIFIED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (304) indicating that a conditional GET operation found that the resource was available and not modified. SC_OK - Static variable in interface javax.servlet.http.HttpServletResponse Status code (200) indicating the request succeeded normally. SC_PARTIAL_CONTENT - Static variable in interface javax.servlet.http.HttpServletResponse Status code (206) indicating that the server has fulfilled the partial GET request for the resource. SC_PAYMENT_REQUIRED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (402) reserved for future use. SC_PRECONDITION_FAILED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (412) indicating that the precondition given in one or more of the request-header fields evaluated to false when it was tested on the server. SC_PROXY_AUTHENTICATION_REQUIRED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (407) indicating that the client MUST first authenticate itself with the proxy. SC_REQUEST_ENTITY_TOO_LARGE - Static variable in interface javax.servlet.http.HttpServletResponse Status code (413) indicating that the server is refusing to process the request because the request entity is larger than the server is willing or able to process. SC_REQUEST_TIMEOUT - Static variable in interface javax.servlet.http.HttpServletResponse Status code (408) indicating that the client did not produce a request within the time that the server was prepared to wait. SC_REQUEST_URI_TOO_LONG - Static variable in interface javax.servlet.http.HttpServletResponse Status code (414) indicating that the server is refusing to service the request because the Request-URI is longer than the server is willing to interpret. SC_REQUESTED_RANGE_NOT_SATISFIABLE - Static variable in interface javax.servlet.http.HttpServletResponse Status code (416) indicating that the server cannot serve the requested byte range. SC_RESET_CONTENT - Static variable in interface javax.servlet.http.HttpServletResponse Status code (205) indicating that the agent SHOULD reset the document view which caused the request to be sent. SC_SEE_OTHER - Static variable in interface javax.servlet.http.HttpServletResponse Status code (303) indicating that the response to the request can be found under a different URI. SC_SERVICE_UNAVAILABLE - Static variable in interface javax.servlet.http.HttpServletResponse Status code (503) indicating that the HTTP server is temporarily overloaded, and unable to handle the request. SC_SWITCHING_PROTOCOLS - Static variable in interface javax.servlet.http.HttpServletResponse Status code (101) indicating the server is switching protocols according to Upgrade header. SC_TEMPORARY_REDIRECT - Static variable in interface javax.servlet.http.HttpServletResponse Status code (307) indicating that the requested resource resides temporarily under a different URI. SC_UNAUTHORIZED - Static variable in interface javax.servlet.http.HttpServletResponse Status code (401) indicating that the request requires HTTP authentication. SC_UNSUPPORTED_MEDIA_TYPE - Static variable in interface javax.servlet.http.HttpServletResponse Status code (415) indicating that the server is refusing to service the request because the entity of the request is in a format not supported by the requested resource for the requested method. SC_USE_PROXY - Static variable in interface javax.servlet.http.HttpServletResponse Status code (305) indicating that the requested resource MUST be accessed through the proxy given by the Location field. sendError(int) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to call sendError(int sc) on the wrapped response object. sendError(int) - Method in interface javax.servlet.http.HttpServletResponse Sends an error response to the client using the specified status code and clearing the buffer. sendError(int, String) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to call sendError(int sc, String msg) on the wrapped response object. sendError(int, String) - Method in interface javax.servlet.http.HttpServletResponse Sends an error response to the client using the specified status. sendRedirect(String) - Method in class javax.servlet.http.HttpServletResponseWrapper The default behavior of this method is to return sendRedirect(String location) on the wrapped response object. sendRedirect(String) - Method in interface javax.servlet.http.HttpServletResponse Sends a temporary redirect response to the client using the specified redirect location URL. service(HttpServletRequest, HttpServletResponse) - Method in class javax.servlet.http.HttpServlet Receives standard HTTP requests from the public service method and dispatches them to the doXXX methods defined in this class. service(ServletRequest, ServletResponse) - Method in interfac

Contents Overview 1 Lesson 1: Index Concepts 3 Lesson 2: Concepts – Statistics 29 Lesson 3: Concepts – Query Optimization 37 Lesson 4: Information Collection and Analysis 61 Lesson 5: Formulating and Implementing Resolution 75 Module 6: Troubleshooting Query Performance Overview At the end of this module, you will be able to:  Describe the different types of indexes and how indexes can be used to improve performance.  Describe what statistics are used for and how they can help in optimizing query performance.  Describe how queries are optimized.  Analyze the information collected from various tools.  Formulate resolution to query performance problems. Lesson 1: Index Concepts Indexes are the most useful tool for improving query performance. Without a useful index, Microsoft® SQL Server™ must search every row on every page in table to find the rows to return. With a multitable query, SQL Server must sometimes search a table multiple times so each page is scanned much more than once. Having useful indexes speeds up finding individual rows in a table, as well as finding the matching rows needed to join two tables. What You Will Learn After completing this lesson, you will be able to:  Understand the structure of SQL Server indexes.  Describe how SQL Server uses indexes to find rows.  Describe how fillfactor can impact the performance of data retrieval and insertion.  Describe the different types of fragmentation that can occur within an index. Recommended Reading  Chapter 8: “Indexes”, Inside SQL Server 2000 by Kalen Delaney  Chapter 11: “Batches, Stored Procedures and Functions”, Inside SQL Server 2000 by Kalen Delaney Finding Rows without Indexes With No Indexes, A Table Must Be Scanned SQL Server keeps track of which pages belong to a table or index by using IAM pages. If there is no clustered index, there is a sysindexes row for the table with an indid value of 0, and that row will keep track of the address of the first IAM for the table. The IAM is a giant bitmap, and every 1 bit indicates that the corresponding extent belongs to the table. The IAM allows SQL Server to do efficient prefetching of the table’s extents, but every row still must be examined. General Index Structure All SQL Server Indexes Are Organized As B-Trees Indexes in SQL Server store their information using standard B-trees. A B-tree provides fast access to data by searching on a key value of the index. B-trees cluster records with similar keys. The B stands for balanced, and balancing the tree is a core feature of a B-tree’s usefulness. The trees are managed, and branches are grafted as necessary, so that navigating down the tree to find a value and locate a specific record takes only a few page accesses. Because the trees are balanced, finding any record requires about the same amount of resources, and retrieval speed is consistent because the index has the same depth throughout. Clustered and Nonclustered Indexes Both Index Types Have Many Common Features An index consists of a tree with a root from which the navigation begins, possible intermediate index levels, and bottom-level leaf pages. You use the index to find the correct leaf page. The number of levels in an index will vary depending on the number of rows in the table and the size of the key column or columns for the index. If you create an index using a large key, fewer entries will fit on a page, so more pages (and possibly more levels) will be needed for the index. On a qualified select, update, or delete, the correct leaf page will be the lowest page of the tree in which one or more rows with the specified key or keys reside. A qualified operation is one that affects only specific rows that satisfy the conditions of a WHERE clause, as opposed to accessing the whole table. An index can have multiple node levels An index page above the leaf is called a node page. Each index row in node pages contains an index key (or set of keys for a composite index) and a pointer to a page at the next level for which the first key value is the same as the key value in the current index row. Leaf Level contains all key values In any index, whether clustered or nonclustered, the leaf level contains every key value, in key sequence. In SQL Server 2000, the sequence can be either ascending or descending. The sysindexes table contains all sizing, location and distribution information Any information about size of indexes or tables is stored in sysindexes. The only source of any storage location information is the sysindexes table, which keeps track of the address of the root page for every index, and the first IAM page for the index or table. There is also a column for the first page of the table, but this is not guaranteed to be reliable. SQL Server can find all pages belonging to an index or table by examining the IAM pages. Sysindexes contains a pointer to the first IAM page, and each IAM page contains a pointer to the next one. The Difference between Clustered and Nonclustered Indexes The main difference between the two types of indexes is how much information is stored at the leaf. The leaf levels of both types of indexes contain all the key values in order, but they also contain other information. Clustered Indexes The Leaf Level of a Clustered Index Is the Data The leaf level of a clustered index contains the data pages, not just the index keys. Another way to say this is that the data itself is part of the clustered index. A clustered index keeps the data in a table ordered around the key. The data pages in the table are kept in a doubly linked list called the page chain. The order of pages in the page chain, and the order of rows on the data pages, is the order of the index key or keys. Deciding which key to cluster on is an important performance consideration. When the index is traversed to the leaf level, the data itself has been retrieved, not simply pointed to. Uniqueness Is Maintained In Key Values In SQL Server 2000, all clustered indexes are unique. If you build a clustered index without specifying the unique keyword, SQL Server forces uniqueness by adding a uniqueifier to the rows when necessary. This uniqueifier is a 4-byte value added as an additional sort key to only the rows that have duplicates of their primary sort key. You can see this extra value if you use DBCC PAGE to look at the actual index rows the section on indexes internal. . Finding Rows in a Clustered Index The Leaf Level of a Clustered Index Contains the Data A clustered index is like a telephone directory in which all of the rows for customers with the same last name are clustered together in the same part of the book. Just as the organization of a telephone directory makes it easy for a person to search, SQL Server quickly searches a table with a clustered index. Because a clustered index determines the sequence in which rows are stored in a table, there can only be one clustered index for a table at a time. Performance Considerations Keeping your clustered key value small increases the number of index rows that can be placed on an index page and decreases the number of levels that must be traversed. This minimizes I/O. As we’ll see, the clustered key is duplicated in every nonclustered index row, so keeping your clustered key small will allow you to have more index fit per page in all your indexes. Note The query corresponding to the slide is: SELECT lastname, firstname FROM member WHERE lastname = ‘Ota’ Nonclustered Indexes The Leaf Level of a Nonclustered Index Contains a Bookmark A nonclustered index is like the index of a textbook. The data is stored in one place and the index is stored in another. Pointers indicate the storage location of the indexed items in the underlying table. In a nonclustered index, the leaf level contains each index key, plus a bookmark that tells SQL Server where to find the data row corresponding to the key in the index. A bookmark can take one of two forms:  If the table has a clustered index, the bookmark is the clustered index key for the corresponding data row. This clustered key can be multiple column if the clustered index is composite, or is defined to be non-unique.  If the table is a heap (in other words, it has no clustered index), the bookmark is a RID, which is an actual row locator in the form File#:Page#:Slot#. Finding Rows with a NC Index on a Heap Nonclustered Indexes Are Very Efficient When Searching For A Single Row After the nonclustered key at the leaf level of the index is found, only one more page access is needed to find the data row. Searching for a single row using a nonclustered index is almost as efficient as searching for a single row in a clustered index. However, if we are searching for multiple rows, such as duplicate values, or keys in a range, anything more than a small number of rows will make the nonclustered index search very inefficient. Note The query corresponding to the slide is: SELECT lastname, firstname FROM member WHERE lastname BETWEEN ‘Master’ AND ‘Rudd’ Finding Rows with a NC Index on a Clustered Table A Clustered Key Is Used as the Bookmark for All Nonclustered Indexes If the table has a clustered index, all columns of the clustered key will be duplicated in the nonclustered index leaf rows, unless there is overlap between the clustered and nonclustered key. For example, if the clustered index is on (lastname, firstname) and a nonclustered index is on firstname, the firstname value will not be duplicated in the nonclustered index leaf rows. Note The query corresponding to the slide is: SELECT lastname, firstname, phone FROM member WHERE firstname = ‘Mike’ Covering Indexes A Covering Index Provides the Fastest Data Access A covering index contains ALL the fields accessed in the query. Normally, only the columns in the WHERE clause are helpful in determining useful indexes, but for a covering index, all columns must be included. If all columns needed for the query are in the index, SQL Server never needs to access the data pages. If even one column in the query is not part of the index, the data rows must be accessed. The leaf level of an index is the only level that contains every key value, or set of key values. For a clustered index, the leaf level is the data itself, so in reality, a clustered index ALWAYS covers any query. Nevertheless, for most of our optimization discussions, we only consider nonclustered indexes. Scanning the leaf level of a nonclustered index is almost always faster than scanning a clustered index, so covering indexes are particular valuable when we need ALL the key values of a particular nonclustered index. Example: Select an aggregate value of a column with a clustered index. Suppose we have a nonclustered index on price, this query is covered: SELECT avg(price) from titles Since the clustered key is included in every nonclustered index row, the clustered key can be included in the covering. Suppose you have a nonclustered index on price and a clustered index on title_id; then this query is covered: SELECT title_id, price FROM titles WHERE price between 10 and 20 Performance Considerations In general, you do want to keep your indexes narrow. However, if you have a critical query that just is not giving you satisfactory performance no matter what you do, you should consider creating an index to cover it, or adding one or two extra columns to an existing index, so that the query will be covered. The leaf level of a nonclustered index is like a ‘mini’ clustered index, so you can have most of the benefits of clustering, even if there already is another clustered index on the table. The tradeoff to adding more, wider indexes for covering queries are the added disk space, and more overhead for updating those columns that are now part of the index. Bug In general, SQL Server will detect when a query is covered, and detect the possible covering indexes. However, in some cases, you must force SQL Server to use a covering index by including a WHERE clause, even if the WHERE clause will return ALL the rows in the table. This is SHILOH bug #352079 Steps to reproduce 1. Make copy of orders table from Northwind: USE Northwind CREATE TABLE [NewOrders] ( [OrderID] [int] NOT NULL , [CustomerID] [nchar] (5) NULL , [EmployeeID] [int] NULL , [OrderDate] [datetime] NULL , [RequiredDate] [datetime] NULL , [ShippedDate] [datetime] NULL , [ShipVia] [int] NULL , [Freight] [money] NULL , [ShipName] [nvarchar] (40) NULL, [ShipAddress] [nvarchar] (60) , [ShipCity] [nvarchar] (15) NULL, [ShipRegion] [nvarchar] (15) NULL, [ShipPostalCode] [nvarchar] (10) NULL, [ShipCountry] [nvarchar] (15) NULL ) INSERT into NewOrders SELECT * FROM Orders 2. Build nc index on OrderDate: create index dateindex on neworders(orderdate) 3. Test Query by looking at query plan: select orderdate from NewOrders The index is being scanned, as expected. 4. Build an index on orderId: create index orderid_index on neworders(orderID) 5. Test Query by looking at query plan: select orderdate from NewOrders Now the TABLE is being scanned, instead of the original index! Index Intersection Multiple Indexes Can Be Used On A Single Table In versions prior to SQL Server 7, only one index could be used for any table to process any single query. The only exception was a query involving an OR. In current SQL Server versions, multiple nonclustered indexes can each be accessed, retrieving a set of keys with bookmarks, and then the result sets can be joined on the common bookmarks. The optimizer weighs the cost of performing the unindexed join on the intermediate result sets, with the cost of only using one index, and then scanning the entire result set from that single index. Fillfactor and Performance Creating an Index with a Low Fillfactor Delays Page Splits when Inserting DBCC SHOWCONTIG will show you a low value for “Avg. Page Density” when a low fillfactor has been specified. This is good for inserts and updates, because it will delay the need to split pages to make room for new rows. It can be bad for scans, because fewer rows will be on each page, and more pages must be read to access the same amount of data. However, this cost will be minimal if the scan density value is good. Index Reorganization DBCC SHOWCONTIG Provides Lots of Information Here’s some sample output from running a basic DBCC SHOWCONTIG on the order details table in the Northwind database: DBCC SHOWCONTIG scanning 'Order Details' table... Table: 'Order Details' (325576198); index ID: 1, database ID:6 TABLE level scan performed. - Pages Scanned................................: 9 - Extents Scanned..............................: 6 - Extent Switches..............................: 5 - Avg. Pages per Extent........................: 1.5 - Scan Density [Best Count:Actual Count].......: 33.33% [2:6] - Logical Scan Fragmentation ..................: 0.00% - Extent Scan Fragmentation ...................: 16.67% - Avg. Bytes Free per Page.....................: 673.2 - Avg. Page Density (full).....................: 91.68% By default, DBCC SHOWCONTIG scans the page chain at the leaf level of the specified index and keeps track of the following values:  Average number of bytes free on each page (Avg. Bytes Free per Page)  Number of pages accessed (Pages scanned)  Number of extents accessed (Extents scanned)  Number of times a page had a lower page number than the previous page in the scan (This value for Out of order pages is not displayed, but is used for additional computations.)  Number of times a page in the scan was on a different extent than the previous page in the scan (Extent switches) SQL Server also keeps track of all the extents that have been accessed, and then it determines how many gaps are in the used extents. An extent is identified by the page number of its first page. So, if extents 8, 16, 24, 32, and 40 make up an index, there are no gaps. If the extents are 8, 16, 24, and 40, there is one gap. The value in DBCC SHOWCONTIG’s output called Extent Scan Fragmentation is computed by dividing the number of gaps by the number of extents, so in this example the Extent Scan Fragmentation is ¼, or 25 percent. A table using extents 8, 24, 40, and 56 has three gaps, and its Extent Scan Fragmentation is ¾, or 75 percent. The maximum number of gaps is the number of extents - 1, so Extent Scan Fragmentation can never be 100 percent. The value in DBCC SHOWCONTIG’s output called Logical Scan Fragmentation is computed by dividing the number of Out of order pages by the number of pages in the table. This value is meaningless in a heap. You can use either the Extent Scan Fragmentation value or the Logical Scan Fragmentation value to determine the general level of fragmentation in a table. The lower the value, the less fragmentation there is. Alternatively, you can use the value called Scan Density, which is computed by dividing the optimum number of extent switches by the actual number of extent switches. A high value means that there is little fragmentation. Scan Density is not valid if the table spans multiple files; therefore, it is less useful than the other values. SQL Server 2000 allows online defragmentation You can choose from several methods for removing fragmentation from an index. You could rebuild the index and have SQL Server allocate all new contiguous pages for you. To rebuild the index, you can use a simple DROP INDEX and CREATE INDEX combination, but in many cases using these commands is less than optimal. In particular, if the index is supporting a constraint, you cannot use the DROP INDEX command. Alternatively, you can use DBCC DBREINDEX, which can rebuild all the indexes on a table in one operation, or you can use the drop_existing clause along with CREATE INDEX. The drawback of these methods is that the table is unavailable while SQL Server is rebuilding the index. When you are rebuilding only nonclustered indexes, SQL Server takes a shared lock on the table, which means that users cannot make modifications, but other processes can SELECT from the table. Of course, those SELECT queries cannot take advantage of the index you are rebuilding, so they might not perform as well as they would otherwise. If you are rebuilding a clustered index, SQL Server takes an exclusive lock and does not allow access to the table, so your data is temporarily unavailable. SQL Server 2000 lets you defragment an index without completely rebuilding it. DBCC INDEXDEFRAG reorders the leaf-level pages into physical order as well as logical order, but using only the pages that are already allocated to the leaf level. This command does an in-place ordering, which is similar to a sorting technique called bubble sort (you might be familiar with this technique if you've studied and compared various sorting algorithms). In-place ordering can reduce logical fragmentation to 2 percent or less, making an ordered scan through the leaf level much faster. DBCC INDEXDEFRAG also compacts the pages of an index, based on the original fillfactor. The pages will not always end up with the original fillfactor, but SQL Server uses that value as a goal. The defragmentation process attempts to leave at least enough space for one average-size row on each page. In addition, if SQL Server cannot obtain a lock on a page during the compaction phase of DBCC INDEXDEFRAG, it skips the page and does not return to it. Any empty pages created as a result of compaction are removed. The algorithm SQL Server 2000 uses for DBCC INDEXDEFRAG finds the next physical page in a file belonging to the index's leaf level and the next logical page in the leaf level to swap it with. To find the next physical page, the algorithm scans the IAM pages belonging to that index. In a database spanning multiple files, in which a table or index has pages on more than one file, SQL Server handles pages on different files separately. SQL Server finds the next logical page by scanning the index's leaf level. After each page move, SQL Server drops all locks and saves the last key on the last page it moved. The next iteration of the algorithm uses the last key to find the next logical page. This process lets other users update the table and index while DBCC INDEXDEFRAG is running. Let us look at an example in which an index's leaf level consists of the following pages in the following logical order: 47 22 83 32 12 90 64 The first key is on page 47, and the last key is on page 64. SQL Server would have to scan the pages in this order to retrieve the data in sorted order. As its first step, DBCC INDEXDEFRAG would find the first physical page, 12, and the first logical page, 47. It would then swap the pages, using a temporary buffer as a holding area. After the first swap, the leaf level would look like this: 12 22 83 32 47 90 64 The next physical page is 22, which is also the next logical page, so no work would be necessary. DBCC INDEXDEFRAG would then swap the next physical page, 32, with the next logical page, 83: 12 22 32 83 47 90 64 After the next swap of 47 with 83, the leaf level would look like this: 12 22 32 47 83 90 64 Then, the defragmentation process would swap 64 with 83: 12 22 32 47 64 90 83 and 83 with 90: 12 22 32 47 64 83 90 At the end of the DBCC INDEXDEFRAG operation, the pages in the table or index are not contiguous, but their logical order matches their physical order. Now, if the pages were accessed from disk in sorted order, the head would need to move in only one direction. Keep in mind that DBCC INDEXDEFRAG uses only pages that are already part of the index's leaf level; it allocates no new pages. In addition, defragmenting a large table can take quite a while, and you will get a report every 5 minutes about the estimated percentage completed. However, except for the locks on the pages being switched, this command needs no additional locks. All the table's other pages and indexes are fully available for your applications to use during the defragmentation process. If you must completely rebuild an index because you want a new fillfactor, or if simple defragmentation is not enough because you want to remove all fragmentation from your indexes, another SQL Server 2000 improvement makes index rebuilding less of an imposition on the rest of the system. SQL Server 2000 lets you create an index in parallel—that is, using multiple processors—which drastically reduces the time necessary to perform the rebuild. The algorithm SQL Server 2000 uses, allows near-linear scaling with the number of processors you use for the rebuild, so four processors will take only one-fourth the time that one processor requires to rebuild an index. System availability increases because the length of time that a table is unavailable decreases. Note that only the SQL Server 2000 Enterprise Edition supports parallel index creation. Indexes on Views and Computed Columns Building an Index Gives the Data Physical Existence Normally, views are only logical and the rows comprising the view’s data are not generated until the view is accessed. The values for computed columns are typically not stored anywhere in the database; only the definition for the computation is stored and the computation is redone every time a computed column is accessed. The first index on a view must be a clustered index, so that the leaf level can hold all the actual rows that make up the view. Once that clustered index has been build, and the view’s data is now physical, additional (nonclustered) indexes can be built. An index on a computed column can be nonclustered, because all we need to store is the index key values. Common Prerequisites for Indexed Views and Indexes on Computed Columns In order for SQL Server to create use these special indexes, you must have the seven SET options correctly specified: ARITHABORT, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER, ANSI_NULLS, ANSI_PADDING, ANSI_WARNING must be all ON NUMERIC_ROUNDABORT must be OFF Only deterministic expressions can be used in the definition of Indexed Views or indexes on Computed Columns. See the BOL for the list of deterministic functions and expressions. Property functions are available to check if a column or view meets the requirements and is indexable. SELECT OBJECTPROPERTY (Object_id, ‘IsIndexable’) SELECT COLUMNPROPERTY (Object_id, column_name , ‘IsIndexable’ ) Schema Binding Guarantees That Object Definition Won’t Change A view can only be indexed if it has been built with schema binding. The SQL Server Optimizer Determines If the Indexed View Can Be Used The query must request a subset of the data contained in the view. The ability of the optimizer to use the indexed view even if the view is not directly referenced is available only in SQL Server 2000 Enterprise Edition. In Standard edition, you can create indexed views, and you can select directly from them, but the optimizer will not choose to use them if they are not directly referenced. Examples of Indexed Views: The best candidates for improvement by indexed views are queries performing aggregations and joins. We will explain how the useful indexed views may be created for these two major groups of queries. The considerations are valid also for queries and indexed views using both joins and aggregations. -- Example: USE Northwind -- Identify 5 products with overall biggest discount total. -- This may be expressed for example by two different queries: -- Q1. select TOP 5 ProductID, SUM(UnitPrice*Quantity)- SUM(UnitPrice*Quantity*(1.00-Discount)) Rebate from [order details] group by ProductID order by Rebate desc --Q2. select TOP 5 ProductID, SUM(UnitPrice*Quantity*Discount) Rebate from [order details] group by ProductID order by Rebate desc --The following indexed view will be used to execute Q1. create view Vdiscount1 with schemabinding as select SUM(UnitPrice*Quantity) SumPrice, SUM(UnitPrice*Quantity*(1.00-Discount)) SumDiscountPrice, COUNT_BIG(*) Count, ProductID from dbo.[order details] group By ProductID create unique clustered index VDiscountInd on Vdiscount1 (ProductID) However, it will not be used by the Q2 because the indexed view does not contain the SUM(UnitPrice*Quantity*Discount) aggregate. We can construct another indexed view create view Vdiscount2 with schemabinding as select SUM(UnitPrice*Quantity) SumPrice, SUM(UnitPrice*Quantity*(1.00-Discount)) SumDiscountPrice, SUM(UnitPrice*Quantity*Discount) SumDiscoutPrice2, COUNT_BIG(*) Count, ProductID from dbo.[order details] group By ProductID create unique clustered index VDiscountInd on Vdiscount2 (ProductID) This view may be used by both Q1 and Q2. Observe that the indexed view Vdiscount2 will have the same number of rows and only one more column compared to Vdiscount1, and it may be used by more queries. In general, try to design indexed views that may be used by more queries. The following query asking for the order with the largest total discount -- Q3. select TOP 3 OrderID, SUM(UnitPrice*Quantity*Discount) OrderRebate from dbo.[order details] group By OrderID Q3 can use neither of the Vdiscount views because the column OrderID is not included in the view definition. To address this variation of the discount analysis query we may create a different indexed view, similar to the query itself. An attempt to generalize the previous indexed view Vdiscount2 so that all three queries Q1, Q2, and Q3 can take advantage of a single indexed view would require a view with both OrderID and ProductID as grouping columns. Because the OrderID, ProductID combination is unique in the original order details table the resulting view would have as many rows as the original table and we would see no savings in using such view compared to using the original table. Consider the size of the resulting indexed view. In the case of pure aggregation, the indexed view may provide no significant performance gains if its size is close to the size of the original table. Complex aggregates (STDEV, VARIANCE, AVG) cannot participate in the index view definition. However, SQL Server may use an indexed view to execute a query containing AVG aggregate. Query containing STDEV or VARIANCE cannot use indexed view to pre-compute these values. The next example shows a query producing the average price for a particular product -- Q4. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from [order details] od, Products p where od.ProductID=p.ProductID group by ProductName, od.ProductID This is an example of indexed view that will be considered by the SQL Server to answer the Q4 create view v3 with schemabinding as select od.ProductID, SUM(od.UnitPrice*(1.00-Discount)) Price, COUNT_BIG(*) Count, SUM(od.Quantity) Units from dbo.[order details] od group by od.ProductID go create UNIQUE CLUSTERED index iv3 on v3 (ProductID) go Observe that the view definition does not contain the table Products. The indexed view does not need to contain all tables used in the query that uses the indexed view. In addition, the following query (same as above Q4 only with one additional search condition) will use the same indexed view. Observe that the added predicate references only columns from tables not present in the v3 view definition. -- Q5. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from [order details] od, Products p where od.ProductID=p.ProductID and p.ProductName like '%tofu%' group by ProductName, od.ProductID The following query cannot use the indexed view because the added search condition od.UnitPrice>10 contains a column from the table in the view definition and the column is neither grouping column nor the predicate appears in the view definition. -- Q6. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from [order details] od, Products p where od.ProductID=p.ProductID and od.UnitPrice>10 group by ProductName, od.ProductID To contrast the Q6 case, the following query will use the indexed view v3 since the added predicate is on the grouping column of the view v3. -- Q7. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from [order details] od, Products p where od.ProductID=p.ProductID and od.ProductID in (1,2,13,41) group by ProductName, od.ProductID -- The previous query Q6 will use the following indexed view V4: create view V4 with schemabinding as select ProductName, od.ProductID, SUM(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units, COUNT_BIG(*) Count from dbo.[order details] od, dbo.Products p where od.ProductID=p.ProductID and od.UnitPrice>10 group by ProductName, od.ProductID create unique clustered index VDiscountInd on V4 (ProductName, ProductID) The same index on the view V4 will be used also for a query where a join to the table Orders is added, for example -- Q8. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from dbo.[order details] od, dbo.Products p, dbo.Orders o where od.ProductID=p.ProductID and o.OrderID=od.OrderID and od.UnitPrice>10 group by ProductName, od.ProductID We will show several modifications of the query Q8 and explain why such modifications cannot use the above view V4. -- Q8a. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from dbo.[order details] od, dbo.Products p, dbo.Orders o where od.ProductID=p.ProductID and o.OrderID=od.OrderID and od.UnitPrice>25 group by ProductName, od.ProductID 8a cannot use the indexed view because of the where clause mismatch. Observe that table Orders does not participate in the indexed view V4 definition. In spite of that, adding a predicate on this table will disallow using the indexed view because the added predicate may eliminate additional rows participating in the aggregates as it is shown in Q8b. -- Q8b. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from dbo.[order details] od, dbo.Products p, dbo.Orders o where od.ProductID=p.ProductID and o.OrderID=od.OrderID and od.UnitPrice>10 and o.OrderDate>'01/01/1998' group by ProductName, od.ProductID Locking and Indexes In General, You Should Let SQL Server Control the Locking within Indexes The stored procedure sp_indexoption lets you manually control the unit of locking within an index. It also lets you disallow page locks or row locks within an index. Since these options are available only for indexes, there is no way to control the locking within the data pages of a heap. (But remember that if a table has a clustered index, the data pages are part of the index and are affected by the sp_indexoption setting.) The index options are set for each table or index individually. Two options, Allow Rowlocks and AllowPageLocks, are both set to TRUE initially for every table and index. If both of these options are set to FALSE for a table, only full table locks are allowed. As described in Module 4, SQL Server determines at runtime whether to initially lock rows, pages, or the entire table. The locking of rows (or keys) is heavily favored. The type of locking chosen is based on the number of rows and pages to be scanned, the number of rows on a page, the isolation level in effect, the update activity going on, the number of users on the system needing memory for their own purposes, and so on. SAP databases frequently use sp_indexoption to reduce deadlocks Setting vs. Querying In SQL Server 2000, the procedure sp_indexoption should only be used for setting an index option. To query an option, use the INDEXPROPERTY function. Lesson 2: Concepts – Statistics Statistics are the most important tool that the SQL Server query optimizer has to determine the ideal execution plan for a query. Statistics that are out of date or nonexistent seriously jeopardize query performance. SQL Server 2000 computes and stores statistics in a completely different format that all earlier versions of SQL Server. One of the improvements is an increased ability to determine which values are out of the normal range in terms of the number of occurrences. The new statistics maintenance routines are particularly good at determining when a key value has a very unusual skew of data. What You Will Learn After completing this lesson, you will be able to:  Define terms related to statistics collected by SQL Server.  Describe how statistics are maintained by SQL Server.  Discuss the autostats feature of SQL Server.  Describe how statistics are used in query optimization. Recommended Reading  Statistics Used by the Query Optimizer in Microsoft SQL Server 2000 http://msdn.microsoft.com/library/techart/statquery.htm Definitions Cardinality The cardinality means how many unique values exist in the data. Density For each index and set of column statistics, SQL Server keeps track of details about the uniqueness (or density) of the data values encountered, which provides a measure of how selective the index is. A unique index, of course, has the lowest density —by definition, each index entry can point to only one row. A unique index has a density value of 1/number of rows in the table. Density values range from 0 through 1. Highly selective indexes have density values of 0.10 or lower. For example, a unique index on a table with 8345 rows has a density of 0.00012 (1/8345). If a nonunique nonclustered index has a density of 0.2165 on the same table, each index key can be expected to point to about 1807 rows (0.2165 × 8345). This is probably not selective enough to be more efficient than just scanning the table, so this index is probably not useful. Because driving the query from a nonclustered index means that the pages must be retrieved in index order, an estimated 1807 data page accesses (or logical reads) are needed if there is no clustered index on the table and the leaf level of the index contains the actual RID of the desired data row. The only time a data page doesn’t need to be reaccessed is when the occasional coincidence occurs in which two adjacent index entries happen to point to the same data page. In general, you can think of density as the average number of duplicates. We can also talk about the term ‘join density’, which applies to the average number of duplicates in the foreign key column. This would answer the question: in this one-to-many relationship, how many is ‘many’? Selectivity In general selectivity applies to a particular data value referenced in a WHERE clause. High selectivity means that only a small percentage of the rows satisfy the WHERE clause filter, and a low selectivity means that many rows will satisfy the filter. For example, in an employees table, the column employee_id is probably very selective, and the column gender is probably not very selective at all. Statistics Statistics are a histogram consisting of an even sampling of values for a column or for an index key (or the first column of the key for a composite index) based on the current data. The histogram is stored in the statblob field of the sysindexes table, which is of type image. (Remember that image data is actually stored in structures separate from the data row itself. The data row merely contains a pointer to the image data. For simplicity’s sake, we’ll talk about the index statistics as being stored in the image field called statblob.) To fully estimate the usefulness of an index, the optimizer also needs to know the number of pages in the table or index; this information is stored in the dpages column of sysindexes. During the second phase of query optimization, index selection, the query optimizer determines whether an index exists for a columns in your WHERE clause, assesses the index’s usefulness by determining the selectivity of the clause (that is, how many rows will be returned), and estimates the cost of finding the qualifying rows. Statistics for a single column index consist of one histogram and one density value. The multicolumn statistics for one set of columns in a composite index consist of one histogram for the first column in the index and density values for each prefix combination of columns (including the first column alone). The fact that density information is kept for all columns helps the optimizer decide how useful the index is for joins. Suppose, for example, that an index is composed of three key fields. The density on the first column might be 0.50, which is not too useful. However, as you look at more key columns in the index, the number of rows pointed to is fewer than (or in the worst case, the same as) the first column, so the density value goes down. If you are looking at both the first and second columns, the density might be 0.25, which is somewhat better. Moreover, if you examine three columns, the density might be 0.03, which is highly selective. It does not make sense to refer to the density of only the second column. The lead column density is always needed. Statistics Maintenance Statistics Information Tracks the Distribution of Key Values SQL Server statistics is basically a histogram that contains up to 200 values of a given key column. In addition to the histogram, the statblob field contains the following information:  The time of the last statistics collection  The number of rows used to produce the histogram and density information  The average key length  Densities for other combinations of columns In the statblob column, up to 200 sample values are stored; the range of key values between each sample value is called a step. The sample value is the endpoint of the range. Three values are stored along with each step: a value called EQ_ROWS, which is the number of rows that have a value equal to that sample value; a value called RANGE_ROWS, which specifies how many other values are inside the range (between two adjacent sample values); and the number of distinct values, or RANGE_DENSITY of the range. DBCC SHOW_STATISTICS The DBCC SHOW_STATISTICS output shows us the first two of these three values, but not the range density. The RANGE_DENSITY is instead used to compute two additional values:  DISTINCT_RANGE_ROWS—the number of distinct rows inside this range (not counting the RANGE_HI_KEY value itself. This is computed as 1/RANGE_DENSITY.  AVG_RANGE_ROWS—the average number of rows per distinct value, computed as RANGE_DENSITY * RANGE_ROWS. In addition to statistics on indexes, SQL Server can also keep track of statistics on columns with no indexes. Knowing the density, or the likelihood of a particular value occurring, can help the optimizer determine an optimum processing strategy, even if SQL Server can’t use an index to actually locate the values. Statistics on Columns Column statistics can be useful for two main purposes  When the SQL Server optimizer is determining the optimal join order, it frequently is best to have the smaller input processed first. By ‘input’ we mean table after all filters in the WHERE clause have been applied. Even if there is no useful index on a column in the WHERE clause, statistics could tell us that only a few rows will quality, and those the resulting input will be very small.  The SQL Server query optimizer can use column statistics on non-initial columns in a composite nonclustered index to determine if scanning the leaf level to obtain the bookmarks will be an efficient processing strategy. For example, in the member table in the credit database, the first name column is almost unique. Suppose we have a nonclustered index on (lastname, firstname), and we issue this query: select * from member where firstname = 'MPRO' In this case, statistics on the firstname column would indicate very few rows satisfying this condition, so the optimizer will choose to scan the nonclustered index, since it is smaller than the clustered index (the table). The small number of bookmarks will then be followed to retrieve the actual data. Manually Updating Statistics You can also manually force statistics to be updated in one of two ways. You can run the UPDATE STATISTICS command on a table or on one specific index or column statistics, or you can also execute the procedure sp_updatestats, which runs UPDATE STATISTICS against all user-defined tables in the current database. You can create statistics on unindexed columns using the CREATE STATISTICS command or by executing sp_createstats, which creates single-column statistics for all eligible columns for all user tables in the current database. This includes all columns except computed columns and columns of the ntext, text, or image datatypes, and columns that already have statistics or are the first column of an index. Autostats By Default SQL Server Will Update Statistics on Any Index or Column as Needed Every database is created with the database options auto create statistics and auto update statistics set to true, but you can turn either one off. You can also turn off automatic updating of statistics for a specific table in one of two ways:  UPDATE STATISTICS In addition to updating the statistics, the option WITH NORECOMPUTE indicates that the statistics should not be automatically recomputed in the future. Running UPDATE STATISTICS again without the WITH NORECOMPUTE option enables automatic updates.  sp_autostats This procedure sets or unsets a flag for a table to indicate that statistics should or should not be updated automatically. You can also use this procedure with only the table name to find out whether the table is set to automatically have its index statistics updated. ' However, setting the database option auto update statistics to FALSE overrides any individual table settings. In other words, no automatic updating of statistics takes place. This is not a recommended practice unless thorough testing has shown you that you do not need the automatic updates or that the performance overhead is more than you can afford. Trace Flags Trace flag 205 – reports recompile due to autostats. Trace flag 8721 – writes information to the errorlog when AutoStats has been run. For more information, see the following Knowledge Base article: Q195565 “INF: How SQL Server 7.0 Autostats Work.” Statistics and Performance The Performance Penalty of NOT Having Up-To-Date Statistics Far Outweighs the Benefit of Avoiding Automatic Updating Autostats should be turned off only after thorough testing shows it to be necessary. Because autostats only forces a recompile after a certain number or percentage of rows has been changed, you do not have to make any adjustments for a read-only database. Lesson 3: Concepts – Query Optimization What You Will Learn After completing this lesson, you will be able to:  Describe the phases of query optimization.  Discuss how SQL Server estimates the selectivity of indexes and column and how this estimate is used in query optimization. Recommended Reading  Chapter 15: “The Query Processor”, Inside SQL Server 2000 by Kalen Delaney  Chapter 16: “Query Tuning”, Inside SQL Server 2000 by Kalen Delaney  Whitepaper about SQL Server Query Processor Architecture by Hal Berenson and Kalen Delaney http://msdn.microsoft.com/library/backgrnd/html/sqlquerproc.htm Phases of Query Optimization Query Optimization Involves several phases Trivial Plan Optimization Optimization itself goes through several steps. The first step is something called Trivial Plan Optimization. The whole idea of trivial plan optimization is that cost based optimization is a bit expensive to run. The optimizer can try a great many possible variations trying to find the cheapest plan. If SQL Server knows that there is only one really viable plan for a query, it could avoid a lot of work. A prime example is a query that consists of an INSERT with a VALUES clause. There is only one possible plan. Another example is a SELECT where all the columns are in a unique covering index, and that index is the only one that is useable. There is no other index that has that set of columns in it. These two examples are cases where SQL Server should just generate the plan and not try to find something better. The trivial plan optimizer finds the really obvious plans, which are typically very inexpensive. In fact, all the plans that get through the autoparameterization template result in plans that the trivial plan optimizer can find. Between those two mechanisms, the plans that are simple tend to be weeded out earlier in the process and do not pay a lot of the compilation cost. This is a good thing, because the number of potential plans in 7.0 went up astronomically as SQL Server added hash joins, merge joins and index intersections, to its list of processing techniques. Simplification and Statistics Loading If a plan is not found by the trivial plan optimizer, SQL Server can perform some simplifications, usually thought of as syntactic transformations of the query itself, looking for commutative properties and operations that can be rearranged. SQL Server can do constant folding, and other operations that do not require looking at the cost or analyzing what indexes are, but that can result in a more efficient query. SQL Server then loads up the metadata including the statistics information on the indexes, and then the optimizer goes through a series of phases of cost based optimization. Cost Based Optimization Phases The cost based optimizer is designed as a set of transformation rules that try various permutations of indexes and join strategies. Because of the number of potential plans in SQL Server 7.0 and SQL Server 2000, if the optimizer just ran through all the combinations and produced a plan, the optimization process would take a very long time to run. Therefore, optimization is broken up into phases. Each phase is a set of rules. After each phase is run, the cost of any resulting plan is examined, and if SQL Server determines that the plan is cheap enough, that plan is kept and executed. If the plan is not cheap enough, the optimizer runs the next phase, which is another set of rules. In the vast majority of cases, a good plan will be found in the preliminary phases. Typically, if the plan that a query would have had in SQL Server 6.5 is also the optimal plan in SQL Server 7.0 and SQL Server 2000, the plan will tend to be found either by the trivial plan optimizer or by the first phase of the cost based optimizer. The rules were intentionally organized to try to make that be true. The plan will probably consist of using a single index and using nested loops. However, every once in a while, because of lack of statistical information, or some other nuance, the optimizer will have to proceed with the later phases of optimization. Sometimes this is because there is a real possibility that the optimizer could find a better plan. When a plan is found, it becomes the optimizer’s output, and then SQL Server goes through all the caching mechanisms that we have already discussed in Module 5. Full Optimization At some point, the optimizer determines that it has gone through enough preliminary phases, and it reverts to a phase called full optimization. If the optimizer goes through all the preliminary phases, and still has not found a cheap plan, it examines the cost for the plan that it has so far. If the cost is above the threshold, the optimizer goes into a phase called full optimization. This threshold is configurable, as the configuration option ‘cost threshold for parallelism’. The full optimization phase assumes that this plan should be run this in parallel. If the machine is very busy, the plan will end up running it in serial, but the optimizer has a goal to produce a good parallel. If the cost is below the threshold (or a single processor machine), the full optimization phase just uses a brute force method to find a serial plan. Selectivity Estimation Selectivity Is One of The Most Important Pieces of Information One of the most import things the optimizer needs to know is the number of rows from any table that will meet all the conditions in the query. If there are no restrictions on a table, and all the rows will be needed, the optimizer can determine the number of rows from the sysindexes table. This number is not absolutely guaranteed to be accurate, but it is the number the optimizer uses. If there is a filter on the table in a WHERE clause, the optimizer needs statistics information. Indexes automatically maintain statistics, and the optimizer will use these values to determine the usefulness of the index. If there is no index on the column involved in the filter, then column statistics can be used or generated. Optimizing Search Arguments In General, the Filters in the WHERE Clause Determine Which Indexes Will Be Useful If an indexed column is referenced in a Search Argument (SARG), the optimizer will analyze the cost of using that index. A SARG has the form:  column value  value column  Operator must be one of =, >, >= <, <= The value can be a constant, an operation, or a variable. Some functions also will be treated as SARGs. These queries have SARGs, and a nonclustered index on firstname will be used in most cases: select * from member where firstname < 'AKKG' select * from member where firstname = substring('HAAKGALSFJA', 2,5) select * from member where firstname = 'AA' + 'KG' declare @name char(4) set @name = 'AKKG' select * from member where firstname < @name Not all functions can be used in SARGs. select * from charge where charge_amt < 2*2 select * from charge where charge_amt < sqrt(16) Compare these queries to ones using = instead of <. With =, the optimizer can use the density information to come up with a good row estimate, even if it’s not going to actually perform the function’s calculations. A filter with a variable is usually a SARG The issue is, can the optimizer come up with useful costing information? A filter with a variable is not a SARG if the variable is of a different datatype, and the column must be converted to the variable’s datatype For more information, see the following Knowledge Base article: Q198625 Enter Title of KB Article Here Use credit go CREATE TABLE [member2] ( [member_no] [smallint] NOT NULL , [lastname] [shortstring] NOT NULL , [firstname] [shortstring] NOT NULL , [middleinitial] [letter] NULL , [street] [shortstring] NOT NULL , [city] [shortstring] NOT NULL , [state_prov] [statecode] NOT NULL , [country] [countrycode] NOT NULL , [mail_code] [mailcode] NOT NULL ) GO insert into member2 select member_no, lastname, firstname, middleinitial, street, city, state_prov, country, mail_code from member alter table member2 add constraint pk_member2 primary key clustered (lastname, member_no, firstname, country) declare @id int set @id = 47 update member2 set city = city + ' City', state_prov = state_prov + ' State' where lastname = 'Barr' and member_no = @id and firstname = 'URQYJBFVRRPWKVW' and country = 'USA' These queries don’t have SARGs, and a table scan will be done: select * from member where substring(lastname, 1,2) = ‘BA’ Some non-SARGs can be converted select * from member where lastname like ‘ba%’ In some cases, you can rewrite your query to turn a non-SARG into a SARG; for example, you can rewrite the substring query above and the LIKE query that follows it. Join Order and Types of Joins Join Order and Strategy Is Determined By the Optimizer The execution plan output will display the join order from top to bottom; i.e. the table listed on top is the first one accessed in a join. You can override the optimizer’s join order decision in two ways:  OPTION (FORCE ORDER) applies to one query  SET FORCEPLAN ON applies to entire session, until set OFF If either of these options is used, the join order is determined by the order the tables are listed in the query’s FROM clause, and no optimizer on JOIN ORDER is done. Forcing the JOIN order may force a particular join strategy. For example, in most outer join operations, the outer table is processed first, and a nested loops join is done. However, if you force the inner table to be accessed first, a merge join will need to be done. Compare the query plan for this query with and without the FORCE ORDER hint: select * from titles right join publishers on titles.pub_id = publishers.pub_id -- OPTION (FORCE ORDER) Nested Loop Join A nested iteration is when the query optimizer constructs a set of nested loops, and the result set grows as it progresses through the rows. The query optimizer performs the following steps. 1. Finds a row from the first table. 2. Uses that row to scan the next table. 3. Uses the result of the previous table to scan the next table. Evaluating Join Combinations The query optimizer automatically evaluates at least four or more possible join combinations, even if those combinations are not specified in the join predicate. You do not have to add redundant clauses. The query optimizer balances the cost and uses statistics to determine the number of join combinations that it evaluates. Evaluating every possible join combination is inefficient and costly. Evaluating Cost of Query Performance When the query optimizer performs a nested join, you should be aware that certain costs are incurred. Nested loop joins are far superior to both merge joins and hash joins when executing small transactions, such as those affecting only a small set of rows. The query optimizer:  Uses nested loop joins if the outer input is quite small and the inner input is indexed and quite large.  Uses the smaller input as the outer table.  Requires that a useful index exist on the join predicate for the inner table.  Always uses a nested loop join strategy if the join operation uses an operator other than an equality operator. Merge Joins The columns of the join conditions are used as inputs to process a merge join. SQL Server performs the following steps when using a merge join strategy: 1. Gets the first input values from each input set. 2. Compares input values. 3. Performs a merge algorithm. • If the input values are equal, the rows are returned. • If the input values are not equal, the lower value is discarded, and the next input value from that input is used for the next comparison. 4. Repeats the process until all of the rows from one of the input sets have been processed. 5. Evaluates any remaining search conditions in the query and returns only rows that qualify. Note Only one pass per input is done. The merge join operation ends after all of the input values of one input have been evaluated. The remaining values from the other input are not processed. Requires That Joined Columns Are Sorted If you execute a query with join operations, and the joined columns are in sorted order, the query optimizer processes the query by using a merge join strategy. A merge join is very efficient because the columns are already sorted, and it requires fewer page I/O. Evaluates Sorted Values For the query optimizer to use the merge join, the inputs must be sorted. The query optimizer evaluates sorted values in the following order: 1. Uses an existing index tree (most typical). The query optimizer can use the index tree from a clustered index or a covered nonclustered index. 2. Leverages sort operations that the GROUP BY, ORDER BY, and CUBE clauses use. The sorting operation only has to be performed once. 3. Performs its own sort operation in which a SORT operator is displayed when graphically viewing the execution plan. The query optimizer does this very rarely. Performance Considerations Consider the following facts about the query optimizer's use of the merge join:  SQL Server performs a merge join for all types of join operations (except cross join or full join operations), including UNION operations.  A merge join operation may be a one-to-one, one-to-many, or many-to-many operation. If the merge join is a many-to-many operation, SQL Server uses a temporary table to store the rows. If duplicate values from each input exist, one of the inputs rewinds to the start of the duplicates as each duplicate value from the other input is processed.  Query performance for a merge join is very fast, but the cost can be high if the query optimizer must perform its own sort operation. If the data volume is large and the desired data can be obtained presorted from existing Balanced-Tree (B-Tree) indexes, merge join is often the fastest join algorithm.  A merge join is typically used if the two join inputs have a large amount of data and are sorted on their join columns (for example, if the join inputs were obtained by scanning sorted indexes).  Merge join operations can only be performed with an equality operator in the join predicate. Hashing is a strategy for dividing data into equal sets of a manageable size based on a given property or characteristic. The grouped data can then be used to determine whether a particular data item matches an existing value. Note Duplicate data or ranges of data are not useful for hash joins because the data is not organized together or in order. When a Hash Join Is Used The query optimizer uses a hash join option when it estimates that it is more efficient than processing queries by using a nested loop or merge join. It typically uses a hash join when an index does not exist or when existing indexes are not useful. Assigns a Build and Probe Input The query optimizer assigns a build and probe input. If the query optimizer incorrectly assigns the build and probe input (this may occur because of imprecise density estimates), it reverses them dynamically. The ability to change input roles dynamically is called role reversal. Build input consists of the column values from a table with the lowest number of rows. Build input creates a hash table in memory to store these values. The hash bucket is a storage place in the hash table in which each row of the build input is inserted. Rows from one of the join tables are placed into the hash bucket where the hash key value of the row matches the hash key value of the bucket. Hash buckets are stored as a linked list and only contain the columns that are needed for the query. A hash table contains hash buckets. The hash table is created from the build input. Probe input consists of the column values from the table with the most rows. Probe input is what the build input checks to find a match in the hash buckets. Note The query optimizer uses column or index statistics to help determine which input is the smaller of the two. Processing a Hash Join The following list is a simplified description of how the query optimizer processes a hash join. It is not intended to be comprehensive because the algorithm is very complex. SQL Server: 1. Reads the probe input. Each probe input is processed one row at a time. 2. Performs the hash algorithm against each probe input and generates a hash key value. 3. Finds the hash bucket that matches the hash key value. 4. Accesses the hash bucket and looks for the matching row. 5. Returns the row if a match is found. Performance Considerations Consider the following facts about the hash joins that the query optimizer uses:  Similar to merge joins, a hash join is very efficient, because it uses hash buckets, which are like a dynamic index but with less overhead for combining rows.  Hash joins can be performed for all types of join operations (except cross join operations), including UNION and DIFFERENCE operations.  A hash operator can remove duplicates and group data, such as SUM (salary) GROUP BY department. The query optimizer uses only one input for both the build and probe roles.  If join inputs are large and are of similar size, the performance of a hash join operation is similar to a merge join with prior sorting. However, if the size of the join inputs is significantly different, the performance of a hash join is often much faster.  Hash joins can process large, unsorted, non-indexed inputs efficiently. Hash joins are useful in complex queries because the intermediate results: • Are not indexed (unless explicitly saved to disk and then indexed). • Are often not sorted for the next operation in the execution plan.  The query optimizer can identify incorrect estimates and make corrections dynamically to process the query more efficiently.  A hash join reduces the need for database denormalization. Denormalization is typically used to achieve better performance by reducing join operations despite redundancy, such as inconsistent updates. Hash joins give you the option to vertically partition your data as part of your physical database design. Vertical partitioning represents groups of columns from a single table in separate files or indexes. Subquery Performance Joins Are Not Inherently Better Than Subqueries Here is an example showing three different ways to update a table, using a second table for lookup purposes. The first uses a JOIN with the update, the second uses a regular introduced with IN, and the third uses a correlated subquery. All three yield nearly identical performance. Note Note that performance comparisons cannot just be made based on I/Os. With HASHING and MERGING techniques, the number of reads may be the same for two queries, yet one may take a lot longer and use more memory resources. Also, always be sure to monitor statistics time. Suppose you want to add a 5 percent discount to order items in the Order Details table for which the supplier is Exotic Liquids, whose supplierid is 1. -- JOIN solution BEGIN TRAN UPDATE OD SET discount = discount + 0.05 FROM [Order Details] AS OD JOIN Products AS P ON OD.productid = P.productid WHERE supplierid = 1 ROLLBACK TRAN -- Regular subquery solution BEGIN TRAN UPDATE [Order Details] SET discount = discount + 0.05 WHERE productid IN (SELECT productid FROM Products WHERE supplierid = 1) ROLLBACK TRAN -- Correlated Subquery Solution BEGIN TRAN UPDATE [Order Details] SET discount = discount + 0.05 WHERE EXISTS(SELECT supplierid FROM Products WHERE [Order Details].productid = Products.productid AND supplierid = 1) ROLLBACK TRAN Internally, Your Join May Be Rewritten SQL Server’s query processor had many different ways of resolving your JOIN expressions. Subqueries may be converted to a JOIN with an implied distinct, which may result in a logical operator of SEMI JOIN. Compare the plans of the first two queries: USE credit select member_no from member where member_no in (select member_no from charge) select distinct m.member_no from member m join charge c on m.member_no = c.member_no The second query uses a HASH MATCH as the final step to remove the duplicates. The first query only had to do a semi join. For these queries, although the I/O values are the same, the first query (with the subquery) runs much faster (almost twice as fast). Another similar looking join is

Java8新特性及实战视频教程完整版Java 8 API添加了一个新的抽象称为流Stream，可以让你以一种声明的方式处理数据。Stream 使用一种类似用 SQL 语句从数据库查询数据的直观方式来提供一种对 Java 集合运算和表达的高阶抽象。Stream API可以极大提高Java程序员的生产力，让程序员写出高效率、干净、简洁的代码。这种风格将要处理的元素集合看作一种流，流在管道中传输，并且可以在管道的节点上进行处理，比如筛选，排序，聚合等。元素流在管道中经过中间操作（intermediate operation）的处理，最后由最终操作(terminal operation)得到前面处理的结果。 Lambda 表达式，也可称为闭包，它是推动 Java 8 发布的最重要新特性。Lambda 允许把函数作为一个方法的参数（函数作为参数传递进方法中）。使用Lambda 表达式可以使代码变的更加简洁紧凑。Java8实战视频-01让方法参数具备行为能力Java8实战视频-02Lambda表达式初探Java8实战视频-03Lambda语法精讲Java8实战视频-04Lambda使用深入解析Java8实战视频-05Lambda方法推导详细解析-上.wmvJava8实战视频-06Lambda方法推导详细解析-下Java8实战视频-07Stream入门及Stream在JVM中的线程表现Java8实战视频-08Stream知识点总结Stream源码阅读Java8实战视频-09如何创建Stream上集Java8实战视频-10如何创建Stream下集.wmvJava8实战视频-11Stream之filter，distinct，skip，limit，map，flatmap详细介绍Java8实战视频-12Stream之Find，Match，Reduce详细介绍Java8实战视频-13NumericStream的详细介绍以及和Stream之间的相互转换Java8实战视频-14Stream综合练习，熟练掌握API的用法Java8实战视频-15在Optional出现之前经常遇到的空指针异常.wmvJava8实战视频-16Optional的介绍以及API的详解Java8实战视频-17Optional之flatMap，综合练习，Optional源码剖析Java8实战视频-18初识Collector体会Collector的强大Java8实战视频-19Collector使用方法深入详细介绍-01Java8实战视频-20Collector使用方法深入详细介绍-02Java8实战视频-21Collector使用方法深入详细介绍-03.wmvJava8实战视频-22Collector使用方法深入详细介绍-04Java8实战视频-23Collector原理讲解，JDK自带Collector源码深度剖析Java8实战视频-24自定义Collector，结合Stream的使用详细介绍Java8实战视频-25Parallel Stream编程体验，充分利用多核机器加快计算速度Java8实战视频-26Fork Join框架实例深入讲解Java8实战视频-27Spliterator接口源码剖析以及自定义Spliterator实现一个Stream.wmvJava8实战视频-28Default方法的介绍和简单的例子Java8实战视频-29Default方法解决多重继承冲突的三大原则详细介绍Java8实战视频-30多线程Future设计模式原理详细介绍，并且实现一个Future程序Java8实战视频-31JDK自带Future,Callable,ExecutorService介绍Java8实战视频-32实现一个异步基于事件回调的Future程序.wmvJava8实战视频-33CompletableFuture用法入门介绍Java8实战视频-34CompletableFuture之supplyAsync详细介绍Java8实战视频-35CompletableFuture流水线工作，join多个异步任务详细讲解Java8实战视频-36CompletableFuture常用API的重点详解-上Java8实战视频-37CompletableFuture常用API的重点详解-下Java8实战视频-38JDK老DateAPI存在的问题，新的DateAPI之LocalDate用法及其介绍.wmvJava8实战视频-39New Date API之LocalTime，LocalDateTime，Instant，Duration，Period详细介绍Java8实战视频-40New Date API之format和parse介绍

Contents Module Overview 1 Lesson 1: Memory 3 Lesson 2: I/O 73 Lesson 3: CPU 111 Module 3: Troubleshooting Server Performance Module Overview Troubleshooting server performance-based support calls requires product knowledge, good communication skills, and a proven troubleshooting methodology. In this module we will discuss Microsoft® SQL Server™ interaction with the operating system and methodology of troubleshooting server-based problems. At the end of this module, you will be able to:  Define the common terms associated the memory, I/O, and CPU subsystems.  Describe how SQL Server leverages the Microsoft Windows® operating system facilities including memory, I/O, and threading.  Define common SQL Server memory, I/O, and processor terms.  Generate a hypothesis based on performance counters captured by System Monitor.  For each hypothesis generated, identify at least two other non-System Monitor pieces of information that would help to confirm or reject your hypothesis.  Identify at least five counters for each subsystem that are key to understanding the performance of that subsystem.  Identify three common myths associated with the memory, I/O, or CPU subsystems. Lesson 1: Memory What You Will Learn After completing this lesson, you will be able to:  Define common terms used when describing memory.  Give examples of each memory concept and how it applies to SQL Server.  Describe how SQL Server user and manages its memory.  List the primary configuration options that affect memory.  Describe how configuration options affect memory usage.  Describe the effect on the I/O subsystem when memory runs low.  List at least two memory myths and why they are not true. Recommended Reading  SQL Server 7.0 Performance Tuning Technical Reference, Microsoft Press  Windows 2000 Resource Kit companion CD-ROM documentation. Chapter 15: Overview of Performance Monitoring  Inside Microsoft Windows 2000, Third Edition, David A. Solomon and Mark E. Russinovich  Windows 2000 Server Operations Guide, Storage, File Systems, and Printing; Chapters: Evaluating Memory and Cache Usage  Advanced Windows, 4th Edition, Jeffrey Richter, Microsoft Press Related Web Sites  http://ntperformance/ Memory Definitions Memory Definitions Before we look at how SQL Server uses and manages its memory, we need to ensure a full understanding of the more common memory related terms. The following definitions will help you understand how SQL Server interacts with the operating system when allocating and using memory. Virtual Address Space A set of memory addresses that are mapped to physical memory addresses by the system. In a 32-bit operation system, there is normally a linear array of 2^32 addresses representing 4,294,967,269 byte addresses. Physical Memory A series of physical locations, with unique addresses, that can be used to store instructions or data. AWE – Address Windowing Extensions A 32-bit process is normally limited to addressing 2 gigabytes (GB) of memory, or 3 GB if the system was booted using the /3G boot switch even if there is more physical memory available. By leveraging the Address Windowing Extensions API, an application can create a fixed-size window into the additional physical memory. This allows a process to access any portion of the physical memory by mapping it into the applications window. When used in combination with Intel’s Physical Addressing Extensions (PAE) on Windows 2000, an AWE enabled application can support up to 64 GB of memory Reserved Memory Pages in a processes address space are free, reserved or committed. Reserving memory address space is a way to reserve a range of virtual addresses for later use. If you attempt to access a reserved address that has not yet been committed (backed by memory or disk) you will cause an access violation. Committed Memory Committed pages are those pages that when accessed in the end translate to pages in memory. Those pages may however have to be faulted in from a page file or memory mapped file. Backing Store Backing store is the physical representation of a memory address. Page Fault (Soft/Hard) A reference to an invalid page (a page that is not in your working set) is referred to as a page fault. Assuming the page reference does not result in an access violation, a page fault can be either hard or soft. A hard page fault results in a read from disk, either a page file or memory-mapped file. A soft page fault is resolved from one of the modified, standby, free or zero page transition lists. Paging is represented by a number of counters including page faults/sec, page input/sec and page output/sec. Page faults/sec include soft and hard page faults where as the page input/output counters represent hard page faults. Unfortunately, all of these counters include file system cache activity. For more information, see also…Inside Windows 2000,Third Edition, pp. 443-451. Private Bytes Private non-shared committed address space Working Set The subset of processes virtual pages that is resident in physical memory. For more information, see also… Inside Windows 2000,Third Edition, p. 455. System Working Set Like a process, the system has a working set. Five different types of pages represent the system’s working set: system cache; paged pool; pageable code and data in the kernel; page-able code and data in device drivers; and system mapped views. The system working set is represented by the counter Memory: cache bytes. System working set paging activity can be viewed by monitoring the Memory: Cache Faults/sec counter. For more information, see also… Inside Windows 2000,Third Edition, p. 463. System Cache The Windows 2000 cache manager provides data caching for both local and network file system drivers. By caching virtual blocks, the cache manager can reduce disk I/O and provide intelligent read ahead. Represented by Memory:Cache Resident bytes. For more information, see also… Inside Windows 2000,Third Edition, pp. 654-659. Non Paged Pool Range of addresses guaranteed to be resident in physical memory. As such, non-paged pool can be accessed at any time without incurring a page fault. Because device drivers operate at DPC/dispatch level (covered in lesson 2), and page faults are not allowed at this level or above, most device drivers use non-paged pool to assure that they do not incur a page fault. Represented by Memory: Pool Nonpaged Bytes, typically between 3-30 megabytes (MB) in size. Note The pool is, in effect, a common area of memory shared by all processes. One of the most common uses of non-paged pool is the storage of object handles. For more information regarding “maximums,” see also… Inside Windows 2000,Third Edition, pp. 403-404 Paged Pool Range of address that can be paged in and out of physical memory. Typically used by drivers who need memory but do not need to access that memory from DPC/dispatch of above interrupt level. Represented by Memory: Pool Paged Bytes and Memory:Pool Paged Resident Bytes. Typically between 10-30MB + size of Registry. For more information regarding “limits,” see also… Inside Windows 2000,Third Edition, pp. 403-404. Stack Each thread has two stacks, one for kernel mode and one for user mode. A stack is an area of memory in which program procedure or function call addresses and parameters are temporarily stored. In Process To run in the same address space. In-process servers are loaded in the client’s address space because they are implemented as DLLs. The main advantage of running in-process is that the system usually does not need to perform a context switch. The disadvantage to running in-process is that DLL has access to the process address space and can potentially cause problems. Out of Process To run outside the calling processes address space. OLEDB providers can run in-process or out of process. When running out of process, they run under the context of DLLHOST.EXE. Memory Leak To reserve or commit memory and unintentionally not release it when it is no longer being used. A process can leak resources such as process memory, pool memory, user and GDI objects, handles, threads, and so on. Memory Concepts (X86 Address Space) Per Process Address Space Every process has its own private virtual address space. For 32-bit processes, that address space is 4 GB, based on a 32-bit pointer. Each process’s virtual address space is split into user and system partitions based on the underlying operating system. The diagram included at the top represents the address partitioning for the 32-bit version of Windows 2000. Typically, the process address space is evenly divided into two 2-GB regions. Each process has access to 2 GB of the 4 GB address space. The upper 2 GB of address space is reserved for the system. The user address space is where application code, global variables, per-thread stacks, and DLL code would reside. The system address space is where the kernel, executive, HAL, boot drivers, page tables, pool, and system cache reside. For specific information regarding address space layout, refer to Inside Microsoft Windows 2000 Third Edition pages 417-428 by Microsoft Press. Access Modes Each virtual memory address is tagged as to what access mode the processor must be running in. System space can only be accessed while in kernel mode, while user space is accessible in user mode. This protects system space from being tampered with by user mode code. Shared System Space Although every process has its own private memory space, kernel mode code and drivers share system space. Windows 2000 does not provide any protection to private memory being use by components running in kernel mode. As such, it is very important to ensure components running in kernel mode are thoroughly tested. 3-GB Address Space 3-GB Address Space Although 2 GB of address space may seem like a large amount of memory, application such as SQL Server could leverage more memory if it were available. The boot.ini option /3GB was created for those cases where systems actually support greater than 2 GB of physical memory and an application can make use of it This capability allows memory intensive applications running on Windows 2000 Advanced Server to use up to 50 percent more virtual memory on Intel-based computers. Application memory tuning provides more of the computer's virtual memory to applications by providing less virtual memory to the operating system. Although a system having less than 2 GB of physical memory can be booted using the /3G switch, in most cases this is ill-advised. If you restart with the 3 GB switch, also known as 4-Gig Tuning, the amount of non-paged pool is reduced to 128 MB from 256 MB. For a process to access 3 GB of address space, the executable image must have been linked with the /LARGEADDRESSAWARE flag or modified using Imagecfg.exe. It should be pointed out that SQL Server was linked using the /LAREGEADDRESSAWARE flag and can leverage 3 GB when enabled. Note Even though you can boot Windows 2000 Professional or Windows 2000 Server with the /3GB boot option, users processes are still limited to 2 GB of address space even if the IMAGE_FILE_LARGE_ADDRESS_AWARE flag is set in the image. The only thing accomplished by using the /3G option on these system is the reduction in the amount of address space available to the system (ISW2K Pg. 418). Important If you use /3GB in conjunction with AWE/PAE you are limited to 16 GB of memory. For more information, see the following Knowledge Base articles: Q171793 Information on Application Use of 4GT RAM Tuning Q126402 PagedPoolSize and NonPagedPoolSize Values in Windows NT Q247904 How to Configure Paged Pool and System PTE Memory Areas Q274598 W2K Does Not Enable Complete Memory Dumps Between 2 & 4 GB AWE Memory Layout AWE Memory Usually, the operation system is limited to 4 GB of physical memory. However, by leveraging PAE, Windows 2000 Advanced Server can support up to 8 GB of memory, and Data Center 64 GB of memory. However, as stated previously, each 32-bit process normally has access to only 2 GB of address space, or 3 GB if the system was booted with the /3-GB option. To allow processes to allocate more physical memory than can be represented in the 2GB of address space, Microsoft created the Address Windows Extensions (AWE). These extensions allow for the allocation and use of up to the amount of physical memory supported by the operating system. By leveraging the Address Windowing Extensions API, an application can create a fixed-size window into the physical memory. This allows a process to access any portion of the physical memory by mapping regions of physical memory in and out of the applications window. The allocation and use of AWE memory is accomplished by  Creating a window via VirtualAlloc using the MEM_PHYSICAL option  Allocating the physical pages through AllocateUserPhysicalPages  Mapping the RAM pages to the window using MapUserPhysicalPages Note SQL Server 7.0 supports a feature called extended memory in Windows NT® 4 Enterprise Edition by using a PSE36 driver. Currently there are no PSE drivers for Windows 2000. The preferred method of accessing extended memory is via the Physical Addressing Extensions using AWE. The AWE mapping feature is much more efficient than the older process of coping buffers from extended memory into the process address space. Unfortunately, SQL Server 7.0 cannot leverage PAE/AWE. Because there are currently no PSE36 drivers for Windows 2000 this means SQL Server 7.0 cannot support more than 3GB of memory on Windows 2000. Refer to KB article Q278466. AWE restrictions  The process must have Lock Pages In Memory user rights to use AWE Important It is important that you use Enterprise Manager or DMO to change the service account. Enterprise Manager and DMO will grant all of the privileges and Registry and file permissions needed for SQL Server. The Service Control Panel does NOT grant all the rights or permissions needed to run SQL Server.  Pages are not shareable or page-able  Page protection is limited to read/write  The same physical page cannot be mapped into two separate AWE regions, even within the same process.  The use of AWE/PAE in conjunction with /3GB will limit the maximum amount of supported memory to between 12-16 GB of memory.  Task manager does not show the correct amount of memory allocated to AWE-enabled applications. You must use Memory Manager: Total Server Memory. It should, however, be noted that this only shows memory in use by the buffer pool.  Machines that have PAE enabled will not dump user mode memory. If an event occurs in User Mode Memory that causes a blue screen and root cause determination is absolutely necessary, the machine must be booted with the /NOPAE switch, and with /MAXMEM set to a number appropriate for transferring dump files.  With AWE enabled, SQL Server will, by default, allocate almost all memory during startup, leaving 256 MB or less free. This memory is locked and cannot be paged out. Consuming all available memory may prevent other applications or SQL Server instances from starting. Note PAE is not required to leverage AWE. However, if you have more than 4GB of physical memory you will not be able to access it unless you enable PAE. Caution It is highly recommended that you use the “max server memory” option in combination with “awe enabled” to ensure some memory headroom exists for other applications or instances of SQL Server, because AWE memory cannot be shared or paged. For more information, see the following Knowledge Base articles: Q268363 Intel Physical Addressing Extensions (PAE) in Windows 2000 Q241046 Cannot Create a dump File on Computers with over 4 GB RAM Q255600 Windows 2000 utilities do not display physical memory above 4GB Q274750 How to configure SQL Server memory more than 2 GB (Idea) Q266251 Memory dump stalls when PAE option is enabled (Idea) Tip The KB will return more hits if you query on PAE rather than AWE. Virtual Address Space Mapping Virtual Address Space Mapping By default Windows 2000 (on an X86 platform) uses a two-level (three-level when PAE is enabled) page table structure to translate virtual addresses to physical addresses. Each 32-bit address has three components, as shown below. When a process accesses a virtual address the system must first locate the Page Directory for the current process via register CR3 (X86). The first 10 bits of the virtual address act as an index into the Page Directory. The Page Directory Entry then points to the Page Frame Number (PFN) of the appropriate Page Table. The next 10 bits of the virtual address act as an index into the Page Table to locate the appropriate page. If the page is valid, the PTE contains the PFN of the actual page in memory. If the page is not valid, the memory management fault handler locates the page and attempts to make it valid. The final 12 bits act as a byte offset into the page. Note This multi-step process is expensive. This is why systems have translation look aside buffers (TLB) to speed up the process. One of the reasons context switching is so expensive is the translation buffers must be dumped. Thus, the first few lookups are very expensive. Refer to ISW2K pages 439-440. Core System Memory Related Counters Core System Memory Related Counters When evaluating memory performance you are looking at a wide variety of counters. The counters listed here are a few of the core counters that give you quick overall view of the state of memory. The two key counters are Available Bytes and Committed Bytes. If Committed Bytes exceeds the amount of physical memory in the system, you can be assured that there is some level of hard page fault activity happening. The goal of a well-tuned system is to have as little hard paging as possible. If Available Bytes is below 5 MB, you should investigate why. If Available Bytes is below 4 MB, the Working Set Manager will start to aggressively trim the working sets of process including the system cache.  Committed Bytes Total memory, including physical and page file currently committed  Commit Limit • Physical memory + page file size • Represents the total amount of memory that can be committed without expanding the page file. (Assuming page file is allowed to grow)  Available Bytes Total physical memory currently available Note Available Bytes is a key indicator of the amount of memory pressure. Windows 2000 will attempt to keep this above approximately 4 MB by aggressively trimming the working sets including system cache. If this value is constantly between 3-4 MB, it is cause for investigation. One counter you might expect would be for total physical memory. Unfortunately, there is no specific counter for total physical memory. There are however many other ways to determine total physical memory. One of the most common is by viewing the Performance tab of Task Manager. Page File Usage The only counters that show current page file space usage are Page File:% Usage and Page File:% Peak Usage. These two counters will give you an indication of the amount of space currently used in the page file. Memory Performance Memory Counters There are a number of counters that you need to investigate when evaluating memory performance. As stated previously, no single counter provides the entire picture. You will need to consider many different counters to begin to understand the true state of memory. Note The counters listed are a subset of the counters you should capture. *Available Bytes In general, it is desirable to see Available Bytes above 5 MB. SQL Servers goal on Intel platforms, running Windows NT, is to assure there is approximately 5+ MB of free memory. After Available Bytes reaches 4 MB, the Working Set Manager will start to aggressively trim the working sets of process and, finally, the system cache. This is not to say that working set trimming does not happen before 4 MB, but it does become more pronounced as the number of available bytes decreases below 4 MB. Page Faults/sec Page Faults/sec represents the total number of hard and soft page faults. This value includes the System Working Set as well. Keep this in mind when evaluating the amount of paging activity in the system. Because this counter includes paging associated with the System Cache, a server acting as a file server may have a much higher value than a dedicated SQL Server may have. The System Working Set is covered in depth on the next slide. Because Page Faults/sec includes soft faults, this counter is not as useful as Pages/sec, which represents hard page faults. Because of the associated I/O, hard page faults tend to be much more expensive. *Pages/sec Pages/sec represent the number of pages written/read from disk because of hard page faults. It is the sum of Memory: Pages Input/sec and Memory: Pages Output/sec. Because it is counted in numbers of pages, it can be compared to other counts of pages, such as Memory: Page Faults/sec, without conversion. On a well-tuned system, this value should be consistently low. In and of itself, a high value for this counter does not necessarily indicate a problem. You will need to isolate the paging activity to determine if it is associated with in-paging, out-paging, memory mapped file activity or system cache. Any one of these activities will contribute to this counter. Note Paging in and of itself is not necessarily a bad thing. Paging is only “bad” when a critical process must wait for it’s pages to be in-paged, or when the amount of read/write paging is causing excessive kernel time or disk I/O, thus interfering with normal user mode processing. Tip (Memory: Pages/sec) / (PhysicalDisk: Disk Bytes/sec * 4096) yields the approximate percentage of paging to total disk I/O. Note, this is only relevant on X86 platforms with a 4 KB page size. Page Reads/sec (Hard Page Fault) Page Reads/sec is the number of times the disk was accessed to resolve hard page faults. It includes reads to satisfy faults in the file system cache (usually requested by applications) and in non-cached memory mapped files. This counter counts numbers of read operations, without regard to the numbers of pages retrieved by each operation. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. Page Writes/sec (Hard Page Fault) Page Writes/sec is the number of times pages were written to disk to free up space in physical memory. Pages are written to disk only if they are changed while in physical memory, so they are likely to hold data, not code. This counter counts write operations, without regard to the number of pages written in each operation. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. *Pages Input/sec (Hard Page Fault) Pages Input/sec is the number of pages read from disk to resolve hard page faults. It includes pages retrieved to satisfy faults in the file system cache and in non-cached memory mapped files. This counter counts numbers of pages, and can be compared to other counts of pages, such as Memory:Page Faults/sec, without conversion. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. This is one of the key counters to monitor for potential performance complaints. Because a process must wait for a read page fault this counter, read page faults have a direct impact on the perceived performance of a process. *Pages Output/sec (Hard Page Fault) Pages Output/sec is the number of pages written to disk to free up space in physical memory. Pages are written back to disk only if they are changed in physical memory, so they are likely to hold data, not code. A high rate of pages output might indicate a memory shortage. Windows NT writes more pages back to disk to free up space when physical memory is in short supply. This counter counts numbers of pages, and can be compared to other counts of pages, without conversion. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. Like Pages Input/sec, this is one of the key counters to monitor. Processes will generally not notice write page faults unless the disk I/O begins to interfere with normal data operations. Demand Zero Faults/Sec (Soft Page Fault) Demand Zero Faults/sec is the number of page faults that require a zeroed page to satisfy the fault. Zeroed pages, pages emptied of previously stored data and filled with zeros, are a security feature of Windows NT. Windows NT maintains a list of zeroed pages to accelerate this process. This counter counts numbers of faults, without regard to the numbers of pages retrieved to satisfy the fault. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. Transition Faults/Sec (Soft Page Fault) Transition Faults/sec is the number of page faults resolved by recovering pages that were on the modified page list, on the standby list, or being written to disk at the time of the page fault. The pages were recovered without additional disk activity. Transition faults are counted in numbers of faults, without regard for the number of pages faulted in each operation. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. System Working Set System Working Set Like processes, the system page-able code and data are managed by a working set. For the purpose of this course, that working set is referred to as the System Working Set. This is done to differentiate the system cache portion of the working set from the entire working set. There are five different types of pages that make up the System Working Set. They are: system cache; paged pool; page-able code and data in ntoskrnl.exe; page-able code, and data in device drivers and system-mapped views. Unfortunately, some of the counters that appear to represent the system cache actually represent the entire system working set. Where noted system cache actually represents the entire system working set. Note The counters listed are a subset of the counters you should capture. *Memory: Cache Bytes (Represents Total System Working Set) Represents the total size of the System Working Set including: system cache; paged pool; pageable code and data in ntoskrnl.exe; pageable code and data in device drivers; and system-mapped views. Cache Bytes is the sum of the following counters: System Cache Resident Bytes, System Driver Resident Bytes, System Code Resident Bytes, and Pool Paged Resident Bytes. Memory: System Cache Resident Bytes (System Cache) System Cache Resident Bytes is the number of bytes from the file system cache that are resident in physical memory. Windows 2000 Cache Manager works with the memory manager to provide virtual block stream and file data caching. For more information, see also…Inside Windows 2000,Third Edition, pp. 645-650 and p. 656. Memory: Pool Paged Resident Bytes Represents the physical memory consumed by Paged Pool. This counter should NOT be monitored by itself. You must also monitor Memory: Paged Pool. A leak in the pool may not show up in Pool paged Resident Bytes. Memory: System Driver Resident Bytes Represents the physical memory consumed by driver code and data. System Driver Resident Bytes and System Driver Total Bytes do not include code that must remain in physical memory and cannot be written to disk. Memory: System Code Resident Bytes Represents the physical memory consumed by page-able system code. System Code Resident Bytes and System Code Total Bytes do not include code that must remain in physical memory and cannot be written to disk. Working Set Performance Counter You can measure the number of page faults in the System Working Set by monitoring the Memory: Cache Faults/sec counter. Contrary to the “Explain” shown in System Monitor, this counter measures the total amount of page faults/sec in the System Working Set, not only the System Cache. You cannot measure the performance of the System Cache using this counter alone. For more information, see also…Inside Windows 2000,Third Edition, p. 656. Note You will find that in general the working set manager will usually trim the working sets of normal processes prior to trimming the system working set. System Cache System Cache The Windows 2000 cache manager provides a write-back cache with lazy writing and intelligent read-ahead. Files are not written to disk immediately but differed until the cache manager calls the memory manager to flush the cache. This helps to reduce the total number of I/Os. Once per second, the lazy writer thread queues one-eighth of the dirty pages in the system cache to be written to disk. If this is not sufficient to meet the needs, the lazy writer will calculate a larger value. If the dirty page threshold is exceeded prior to lazy writer waking, the cache manager will wake the lazy writer. Important It should be pointed out that mapped files or files opened with FILE_FLAG_NO_BUFFERING, do not participate in the System Cache. For more information regarding mapped views, see also…Inside Windows 2000,Third Edition, p. 669. For those applications that would like to leverage system cache but cannot tolerate write delays, the cache manager supports write through operations via the FILE_FLAG_WRITE_THROUGH. On the other hand, an application can disable lazy writing by using the FILE_ATTRIBUTE_TEMPORARY. If this flag is enabled, the lazy writer will not write the pages to disk unless there is a shortage of memory or the file is closed. Important Microsoft SQL Server uses both FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH Tip The file system cache is not represented by a static amount of memory. The system cache can and will grow. It is not unusual to see the system cache consume a large amount of memory. Like other working sets, it is trimmed under pressure but is generally the last thing to be trimmed. System Cache Performance Counters The counters listed are a subset of the counters you should capture. Cache: Data Flushes/sec Data Flushes/sec is the rate at which the file system cache has flushed its contents to disk as the result of a request to flush or to satisfy a write-through file write request. More than one page can be transferred on each flush operation. Cache: Data Flush Pages/sec Data Flush Pages/sec is the number of pages the file system cache has flushed to disk as a result of a request to flush or to satisfy a write-through file write request. Cache: Lazy Write Flushes/sec Represents the rate of lazy writes to flush the system cache per second. More than one page can be transferred per second. Cache: Lazy Write Pages/sec Lazy Write Pages/sec is the rate at which the Lazy Writer thread has written to disk. Note When looking at Memory:Cache Faults/sec, you can remove cache write activity by subtracting (Cache: Data Flush Pages/sec + Cache: Lazy Write Pages/sec). This will give you a better idea of how much other page faulting activity is associated with the other components of the System Working Set. However, you should note that there is no easy way to remove the page faults associated with file cache read activity. For more information, see the following Knowledge Base articles: Q145952 (NT4) Event ID 26 Appears If Large File Transfer Fails Q163401 (NT4) How to Disable Network Redirector File Caching Q181073 (SQL 6.5) DUMP May Cause Access Violation on Win2000 System Pool System Pool As documented earlier, there are two types of shared pool memory: non-paged pool and paged pool. Like private memory, pool memory is susceptible to a leak. Nonpaged Pool Miscellaneous kernel code and structures, and drivers that need working memory while at or above DPC/dispatch level use non-paged pool. The primary counter for non-paged pool is Memory: Pool Nonpaged Bytes. This counter will usually between 3 and 30 MB. Paged Pool Drivers that do not need to access memory above DPC/Dispatch level are one of the primary users of paged pool, however any process can use paged pool by leveraging the ExAllocatePool calls. Paged pool also contains the Registry and file and printing structures. The primary counters for monitoring paged pool is Memory: Pool Paged Bytes. This counter will usually be between 10-30MB plus the size of the Registry. To determine how much of paged pool is currently resident in physical memory, monitor Memory: Pool Paged Resident Bytes. Note The paged and non-paged pools are two of the components of the System Working Set. If a suspected leak is clearly visible in the overview and not associated with a process, then it is most likely a pool leak. If the leak is not associated with SQL Server handles, OLDEB providers, XPROCS or SP_OA calls then most likely this call should be pushed to the Windows NT group. For more information, see the following Knowledge Base articles: Q265028 (MS) Pool Tags Q258793 (MS) How to Find Memory Leaks by Using Pool Bitmap Analysis Q115280 (MS) Finding Windows NT Kernel Mode Memory Leaks Q177415 (MS) How to Use Poolmon to Troubleshoot Kernel Mode Memory Leaks Q126402 PagedPoolSize and NonPagedPoolSize Values in Windows NT Q247904 How to Configure Paged Pool and System PTE Memory Areas Tip To isolate pool leaks you will need to isolate all drivers and third-party processes. This should be done by disabling each service or driver one at a time and monitoring the effect. You can also monitor paged and non-paged pool through poolmon. If pool tagging has been enabled via GFLAGS, you may be able to associate the leak to a particular tag. If you suspect a particular tag, you should involve the platform support group. Process Memory Counters Process _Total Limitations Although the rollup of _Total for Process: Private Bytes, Virtual Bytes, Handles and Threads, represent the key resources being used across all processes, they can be misleading when evaluating a memory leak. This is because a leak in one process may be masked by a decrease in another process. Note The counters listed are a subset of the counters you should capture. Tip When analyzing memory leaks, it is often easier to a build either a separate chart or report showing only one or two key counters for all process. The primary counter used for leak analysis is private bytes, but processes can leak handles and threads just as easily. After a suspect process is located, build a separate chart that includes all the counters for that process. Individual Process Counters When analyzing individual process for memory leaks you should include the counters listed.  Process: % Processor Time  Process: Working Set (includes shared pages)  Process: Virtual Bytes  Process: Private Bytes  Process: Page Faults/sec  Process: Handle Count  Process: Thread Count  Process: Pool Paged Bytes  Process: Pool Nonpaged Bytes Tip WINLOGON, SVCHOST, services, or SPOOLSV are referred to as HELPER processes. They provide core functionality for many operations and as such are often extended by the addition of third-party DLLs. Tlist –s may help identify what services are running under a particular helper. Helper Processes Helper Processes Winlogon, Services, and Spoolsv and Svchost are examples of what are referred to as HELPER processes. They provide core functionality for many operations and, as such, are often extended by the addition of third-party DLLs. Running every service in its own process can waste system resources. Consequently, some services run in their own processes while others share a process with other services. One problem with sharing a process is that a bug in one service may cause the entire process to fail. The resource kit tool, Tlist when used with the –s qualifier can help you identify what services are running in what processes. WINLOGON Used to support GINAs. SPOOLSV SPOOLSV is responsible for printing. You will need to investigate all added printing functionality. Services Service is responsible for system services. Svchost.exe Svchost.exe is a generic host process name for services that are run from dynamic-link libraries (DLLs). There can be multiple instances of Svchost.exe running at the same time. Each Svchost.exe session can contain a grouping of services, so that separate services can be run depending on how and where Svchost.exe is started. This allows for better control and debugging. The Effect of Memory on Other Components Memory Drives Overall Performance Processor, cache, bus speeds, I/O, all of these resources play a roll in overall perceived performance. Without minimizing the impact of these components, it is important to point out that a shortage of memory can often have a larger perceived impact on performance than a shortage of some other resource. On the other hand, an abundance of memory can often be leveraged to mask bottlenecks. For instance, in certain environments, file system cache can significantly reduce the amount of disk I/O, potentially masking a slow I/O subsystem. Effect on I/O I/O can be driven by a number of memory considerations. Page read/faults will cause a read I/O when a page is not in memory. If the modified page list becomes too long the Modified Page Writer and Mapped Page Writer will need to start flushing pages causing disk writes. However, the one event that can have the greatest impact is running low on available memory. In this case, all of the above events will become more pronounced and have a larger impact on disk activity. Effect on CPU The most effective use of a processor from a process perspective is to spend as much time possible executing user mode code. Kernel mode represents processor time associated with doing work, directly or indirectly, on behalf of a thread. This includes items such as synchronization, scheduling, I/O, memory management, and so on. Although this work is essential, it takes processor cycles and the cost, in cycles, to transition between user and kernel mode is expensive. Because all memory management and I/O functions must be done in kernel mode, it follows that the fewer the memory resources the more cycles are going to be spent managing those resources. A direct result of low memory is that the Working Set Manager, Modified Page Writer and Mapped Page Writer will have to use more cycles attempting to free memory. Analyzing Memory Look for Trends and Trend Relationships Troubleshooting performance is about analyzing trends and trend relationships. Establishing that some event happened is not enough. You must establish the effect of the event. For example, you note that paging activity is high at the same time that SQL Server becomes slow. These two individual facts may or may not be related. If the paging is not associated with SQL Servers working set, or the disks SQL is using there may be little or no cause/affect relationship. Look at Physical Memory First The first item to look at is physical memory. You need to know how much physical and page file space the system has to work with. You should then evaluate how much available memory there is. Just because the system has free memory does not mean that there is not any memory pressure. Available Bytes in combination with Pages Input/sec and Pages Output/sec can be a good indicator as to the amount of pressure. The goal in a perfect world is to have as little hard paging activity as possible with available memory greater than 5 MB. This is not to say that paging is bad. On the contrary, paging is a very effective way to manage a limited resource. Again, we are looking for trends that we can use to establish relationships. After evaluating physical memory, you should be able to answer the following questions:  How much physical memory do I have?  What is the commit limit?  Of that physical memory, how much has the operating system committed?  Is the operating system over committing physical memory?  What was the peak commit charge?  How much available physical memory is there?  What is the trend associated with committed and available? Review System Cache and Pool Contribution After you understand the individual process memory usage, you need to evaluate the System Cache and Pool usage. These can and often represent a significant portion of physical memory. Be aware that System Cache can grow significantly on a file server. This is usually normal. One thing to consider is that the file system cache tends to be the last thing trimmed when memory becomes low. If you see abrupt decreases in System Cache Resident Bytes when Available Bytes is below 5 MB you can be assured that the system is experiencing excessive memory pressure. Paged and non-paged pool size is also important to consider. An ever-increasing pool should be an indicator for further research. Non-paged pool growth is usually a driver issue, while paged pool could be driver-related or process-related. If paged pool is steadily growing, you should investigate each process to see if there is a specific process relationship. If not you will have to use tools such as poolmon to investigate further. Review Process Memory Usage After you understand the physical memory limitations and cache and pool contribution you need to determine what components or processes are creating the pressure on memory, if any. Be careful if you opt to chart the _Total Private Byte’s rollup for all processes. This value can be misleading in that it includes shared pages and can therefore exceed the actual amount of memory being used by the processes. The _Total rollup can also mask processes that are leaking memory because other processes may be freeing memory thus creating a balance between leaked and freed memory. Identify processes that expand their working set over time for further analysis. Also, review handles and threads because both use resources and potentially can be mismanaged. After evaluating the process resource usage, you should be able to answer the following:  Are any of the processes increasing their private bytes over time?  Are any processes growing their working set over time?  Are any processes increasing the number of threads or handles over time?  Are any processes increasing their use of pool over time?  Is there a direct relationship between the above named resources and total committed memory or available memory?  If there is a relationship, is this normal behavior for the process in question? For example, SQL does not commit ‘min memory’ on startup; these pages are faulted in into the working set as needed. This is not necessarily an indication of a memory leak.  If there is clearly a leak in the overview and is not identifiable in the process counters it is most likely in the pool.  If the leak in pool is not associated with SQL Server handles, then more often than not, it is not a SQL Server issue. There is however the possibility that the leak could be associated with third party XPROCS, SP_OA* calls or OLDB providers. Review Paging Activity and Its Impact on CPU and I/O As stated earlier, paging is not in and of itself a bad thing. When starting a process the system faults in the pages of an executable, as they are needed. This is preferable to loading the entire image at startup. The same can be said for memory mapped files and file system cache. All of these features leverage the ability of the system to fault in pages as needed The greatest impact of paging on a process is when the process must wait for an in-page fault or when page file activity represents a significant portion of the disk activity on the disk the application is actively using. After evaluating page fault activity, you should be able to answer the following questions:  What is the relationship between PageFaults/sec and Page Input/sec + Page Output/Sec?  What is the relationship if any between hard page faults and available memory?  Does paging activity represent a significant portion of processor or I/O resource usage? Don’t Prematurely Jump to Any Conclusions Analyzing memory pressure takes time and patience. An individual counter in and of it self means little. It is only when you start to explore relationships between cause and effect that you can begin to understand the impact of a particular counter. The key thoughts to remember are:  With the exception of a swap (when the entire process’s working set has been swapped out/in), hard page faults to resolve reads, are the most expensive in terms its effect on a processes perceived performance.  In general, page writes associated with page faults do not directly affect a process’s perceived performance, unless that process is waiting on a free page to be made available. Page file activity can become a problem if that activity competes for a significant percentage of the disk throughput in a heavy I/O orientated environment. That assumes of course that the page file resides on the same disk the application is using. Lab 3.1 System Memory Lab 3.1 Analyzing System Memory Using System Monitor Exercise 1 – Troubleshooting the Cardinal1.log File Students will evaluate an existing System Monitor log and determine if there is a problem and what the problem is. Students should be able to isolate the issue as a memory problem, locate the offending process, and determine whether or not this is a pool issue. Exercise 2 – Leakyapp Behavior Students will start leaky app and monitor memory, page file and cache counters to better understand the dynamics of these counters. Exercise 3 – Process Swap Due To Minimizing of the Cmd Window Students will start SQL from command line while viewing SQL process performance counters. Students will then minimize the window and note the effect on the working set. Overview What You Will Learn After completing this lab, you will be able to:  Use some of the basic functions within System Monitor.  Troubleshoot one or more common performance scenarios. Before You Begin Prerequisites To complete this lab, you need the following:  Windows 2000  SQL Server 2000  Lab Files Provided  LeakyApp.exe (Resource Kit) Estimated time to complete this lab: 45 minutes Exercise 1 Troubleshooting the Cardinal1.log File In this exercise, you will analyze a log file from an actual system that was having performance problems. Like an actual support engineer, you will not have much information from which to draw conclusions. The customer has sent you this log file and it is up to you to find the cause of the problem. However, unlike the real world, you have an instructor available to give you hints should you become stuck. Goal Review the Cardinal1.log file (this file is from Windows NT 4.0 Performance Monitor, which Windows 2000 can read). Chart the log file and begin to investigate the counters to determine what is causing the performance problems. Your goal should be to isolate the problem to a major area such as pool, virtual address space etc, and begin to isolate the problem to a specific process or thread. This lab requires access to the log file Cardinal1.log located in C:\LABS\M3\LAB1\EX1  To analyze the log file 1. Using the Performance MMC, select the System Monitor snap-in, and click the View Log File Data button (icon looks like a disk). 2. Under Files of type, choose PERFMON Log Files (*.log) 3. Navigate to the folder containing Cardinal1.log file and open it. 4. Begin examining counters to find what might be causing the performance problems. When examining some of these counters, you may notice that some of them go off the top of the chart. It may be necessary to adjust the scale on these. This can be done by right-clicking the rightmost pane and selecting Properties. Select the Data tab. Select the counter that you wish to modify. Under the Scale option, change the scale value, which makes the counter data visible on the chart. You may need to experiment with different scale values before finding the ideal value. Also, it may sometimes be beneficial to adjust the vertical scale for the entire chart. Selecting the Graph tab on the Properties page can do this. In the Vertical scale area, adjust the Maximum and Minimum values to best fit the data on the chart. Lab 3.1, Exercise 1: Results Exercise 2 LeakyApp Behavior In this lab, you will have an opportunity to work with a partner to monitor a live system, which is suffering from a simulated memory leak. Goal During this lab, your goal is to observe the system behavior when memory starts to become a limited resource. Specifically you will want to monitor committed memory, available memory, the system working set including the file system cache and each processes working set. At the end of the lab, you should be able to provide an answer to the listed questions.  To monitor a live system with a memory leak 1. Choose one of the two systems as a victim on which to run the leakyapp.exe program. It is recommended that you boot using the \MAXMEM=128 option so that this lab goes a little faster. You and your partner should decide which server will play the role of the problematic server and which server is to be used for monitoring purposes. 2. On the problematic server, start the leakyapp program. 3. On the monitoring system, create a counter that logs all necessary counters need to troubleshoot a memory problem. This should include physicaldisk counters if you think paging is a problem. Because it is likely that you will only need to capture less than five minutes of activity, the suggested interval for capturing is five seconds. 4. After the counters have been started, start the leaky application program 5. Click Start Leaking. The button will now change to Stop Leaking, which indicates that the system is now leaking memory. 6. After leakyapp shows the page file is 50 percent full, click Stop leaking. Note that the process has not given back its memory, yet. After approximately one minute, exit. Lab 3.1, Exercise 2: Questions After analyzing the counter logs you should be able to answer the following: 1. Under which system memory counter does the leak show up clearly? Memory:Committed Bytes 2. What process counter looked very similar to the overall system counter that showed the leak? Private Bytes 3. Is the leak in Paged Pool, Non-paged pool, or elsewhere? Elsewhere 4. At what point did Windows 2000 start to aggressively trim the working sets of all user processes? <5 MB Free 5. Was the System Working Set trimmed before or after the working sets of other processes? After 6. What counter showed this? Memory:Cache Bytes 7. At what point was the File System Cache trimmed? After the first pass through all other working sets 8. What was the effect on all the processes working set when the application quit leaking? None 9. What was the effect on all the working sets when the application exited? Nothing, initially; but all grew fairly quickly based on use 10. When the server was running low on memory, which was Windows spending more time doing, paging to disk or in-paging? Paging to disk, initially; however, as other applications began to run, in-paging increased Exercise 3 Minimizing a Command Window In this exercise, you will have an opportunity to observe the behavior of Windows 2000 when a command window is minimized. Goal During this lab, your goal is to observe the behavior of Windows 2000 when a command window becomes minimized. Specifically, you will want to monitor private bytes, virtual bytes, and working set of SQL Server when the command window is minimized. At the end of the lab, you should be able to provide an answer to the listed questions.  To monitor a command window’s working set as the window is minimized 1. Using System Monitor, create a counter list that logs all necessary counters needed to troubleshoot a memory problem. Because it is likely that you will only need to capture less than five minutes of activity, the suggested capturing interval is five seconds. 2. After the counters have been started, start a Command Prompt window on the target system. 3. In the command window, start SQL Server from the command line. Example: SQL Servr.exe –c –sINSTANCE1 4. After SQL Server has successfully started, Minimize the Command Prompt window. 5. Wait approximately two minutes, and then Restore the window. 6. Wait approximately two minutes, and then stop the counter log. Lab 3.1, Exercise 3: Questions After analyzing the counter logs you should be able to answer the following questions: 1. What was the effect on SQL Servers private bytes, virtual bytes, and working set when the window was minimized? Private Bytes and Virtual Bytes remained the same, while Working Set went to 0 2. What was the effect on SQL Servers private bytes, virtual bytes, and working set when the window was restored? None; the Working Set did not grow until SQL accessed the pages and faulted them back in on an as-needed basis SQL Server Memory Overview SQL Server Memory Overview Now that you have a better understanding of how Windows 2000 manages memory resources, you can take a closer look at how SQL Server 2000 manages its memory. During the course of the lecture and labs you will have the opportunity to monitor SQL Servers use of memory under varying conditions using both System Monitor counters and SQL Server tools. SQL Server Memory Management Goals Because SQL Server has in-depth knowledge about the relationships between data and the pages they reside on, it is in a better position to judge when and what pages should be brought into memory, how many pages should be brought in at a time, and how long they should be resident. SQL Servers primary goals for management of its memory are the following:  Be able to dynamically adjust for varying amounts of available memory.  Be able to respond to outside memory pressure from other applications.  Be able to adjust memory dynamically for internal components. Items Covered  SQL Server Memory Definitions  SQL Server Memory Layout  SQL Server Memory Counters  Memory Configurations Options  Buffer Pool Performance and Counters  Set Aside Memory and Counters  General Troubleshooting Process  Memory Myths and Tips SQL Server Memory Definitions SQL Server Memory Definitions Pool A group of resources, objects, or logical components that can service a resource allocation request Cache The management of a pool or resource, the primary goal of which is to increase performance. Bpool The Bpool (Buffer Pool) is a single static class instance. The Bpool is made up of 8-KB buffers and can be used to handle data pages or external memory requests. There are three basic types or categories of committed memory in the Bpool.  Hashed Data Pages  Committed Buffers on the Free List  Buffers known by their owners (Refer to definition of Stolen) Consumer A consumer is a subsystem that uses the Bpool. A consumer can also be a provider to other consumers. There are five consumers and two advanced consumers who are responsible for the different categories of memory. The following list represents the consumers and a partial list of their categories  Connection – Responsible for PSS and ODS memory allocations  General – Resource structures, parse headers, lock manager objects  Utilities – Recovery, Log Manager  Optimizer – Query Optimization  Query Plan – Query Plan Storage Advanced Consumer Along with the five consumers, there are two advanced consumers. They are  Ccache – Procedure cache. Accepts plans from the Optimizer and Query Plan consumers. Is responsible for managing that memory and determines when to release the memory back to the Bpool.  Log Cache – Managed by the LogMgr, which uses the Utility consumer to coordinate memory requests with the Bpool. Reservation Requesting the future use of a resource. A reservation is a reasonable guarantee that the resource will be available in the future. Committed Producing the physical resource Allocation The act of providing the resource to a consumer Stolen The act of getting a buffer from the Bpool is referred to as stealing a buffer. If the buffer is stolen and hashed for a data page, it is referred to as, and counted as, a Hashed buffer, not a stolen buffer. Stolen buffers on the other hand are buffers used for things such as procedure cache and SRV_PROC structures. Target Target memory is the amount of memory SQL Server would like to maintain as committed memory. Target memory is based on the min and max server configuration values and current available memory as reported by the operating system. Actual target calculation is operating system specific. Memory to Leave (Set Aside) The virtual address space set aside to ensure there is sufficient address space for thread stacks, XPROCS, COM objects etc. Hashed Page A page in pool that represents a database page. SQL Server Memory Layout Virtual Address Space When SQL Server is started the minimum of physical ram or virtual address space supported by the OS is evaluated. There are many possible combinations of OS versions and memory configurations. For example: you could be running Microsoft Windows 2000 Advanced Server with 2 GB or possibly 4 GB of memory. To avoid page file use, the appropriate memory level is evaluated for each configuration. Important Utilities can inject a DLL into the process address space by using HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs When the USER32.dll library is mapped into the process space, so, too, are the DLLs listed in the Registry key. To determine what DLL’s are running in SQL Server address space you can use tlist.exe. You can also use a tool such as Depends from Microsoft or HandelEx from http://ww.sysinternals.com. Memory to Leave As stated earlier there are many possible configurations of physical memory and address space. It is possible for physical memory to be greater than virtual address space. To ensure that some virtual address space is always available for things such as thread stacks and external needs such as XPROCS, SQL Server reserves a small portion of virtual address space prior to determining the size of the buffer pool. This address space is referred to as Memory To Leave. Its size is based on the number of anticipated tread stacks and a default value for external needs referred to as cmbAddressSave. After reserving the buffer pool space, the Memory To Leave reservation is released. Buffer Pool Space During Startup, SQL Server must determine the maximum size of the buffer pool so that the BUF, BUFHASH and COMMIT BITMAP structures that are used to manage the Bpool can be created. It is important to understand that SQL Server does not take ‘max memory’ or existing memory pressure into consideration. The reserved address space of the buffer pool remains static for the life of SQL Server process. However, the committed space varies as necessary to provide dynamic scaling. Remember only the committed memory effects the overall memory usage on the machine. This ensures that the max memory configuration setting can be dynamically changed with minimal changes needed to the Bpool. The reserved space does not need to be adjusted and is maximized for the current machine configuration. Only the committed buffers need to be limited to maintain a specified max server memory (MB) setting. SQL Server Startup Pseudo Code The following pseudo code represents the process SQL Server goes through on startup. Warning This example does not represent a completely accurate portrayal of the steps SQL Server takes when initializing the buffer pool. Several details have been left out or glossed over. The intent of this example is to help you understand the general process, not the specific details.  Determine the size of cmbAddressSave (-g)  Determine Total Physical Memory  Determine Available Physical Memory  Determine Total Virtual Memory  Calculate MemToLeave maxworkterthreads * (stacksize=512 KB) + (cmbAddressSave = 256 MB)  Reserve MemToLeave and set PAGE_NOACCESS  Check for AWE, test to see if it makes sense to use it and log the results • Min(Available Memory, Max Server Memory) > Virtual Memory • Supports Read Scatter • SQL Server not started with -f • AWE Enabled via sp_configure • Enterprise Edition • Lock Pages In Memory user right enabled  Calculate Virtual Address Limit VA Limit = Min(Physical Memory, Virtual Memory – MemtoLeave)  Calculate the number of physical and virtual buffers that can be supported AWE Present Physical Buffers = (RAM / (PAGESIZE + Physical Overhead)) Virtual Buffers = (VA Limit / (PAGESIZE + Virtual Overhead)) AWE Not Present Physical Buffers = Virtual Buffers = VA Limit / (PAGESIZE + Physical Overhead + Virtual Overhead)  Make sure we have the minimum number of buffers Physical Buffers = Max(Physical Buffers, MIN_BUFFERS)  Allocate and commit the buffer management structures  Reserve the address space required to support the Bpool buffers  Release the MemToLeave SQL Server Startup Pseudo Code Example The following is an example based on the pseudo code represented on the previous page. This example is based on a machine with 384 MB of physical memory, not using AWE or /3GB. Note CmbAddressSave was changed between SQL Server 7.0 and SQL Server 2000. For SQL Server 7.0, cmbAddressSave was 128. Warning This example does not represent a completely accurate portrayal of the steps SQL Server takes when initializing the buffer pool. Several details have been left out or glossed over. The intent of this example is to help you understand the general process, not the specific details.  Determine the size of cmbAddressSave (No –g so 256MB)  Determine Total Physical Memory (384)  Determine Available Physical Memory (384)  Determine Total Virtual Memory (2GB)  Calculate MemToLeave maxworkterthreads * (stacksize=512 KB) + (cmbAddressSave = 256 MB) (255 * .5MB + 256MB = 384MB)  Reserve MemToLeave and set PAGE_NOACCESS  Check for AWE, test to see if it makes sense to use it and log the results (AWE Not Enabled)  Calculate Virtual Address Limit VA Limit = Min(Physical Memory, Virtual Memory – MemtoLeave) 384MB = Min(384MB, 2GB – 384MB)  Calculate the number of physical and virtual buffers that can be supported AWE Not Present 48664 (approx) = 384 MB / (8 KB + Overhead)  Make sure we have the minimum number of buffers Physical Buffers = Max(Physical Buffers, MIN_BUFFERS) 48664 = Max(48664,1024)  Allocate and commit the buffer management structures  Reserve the address space required to support the Bpool buffers  Release the MemToLeave Tip Trace Flag 1604 can be used to view memory allocations on startup. The cmbAddressSave can be adjusted using the –g XXX startup parameter. SQL Server Memory Counters SQL Server Memory Counters The two primary tools for monitoring and analyzing SQL Server memory usage are System Monitor and DBCC MEMORYSTATUS. For detailed information on DBCC MEMORYSTATUS refer to Q271624 Interpreting the Output of the DBCC MEMORYSTAUS Command. Important Represents SQL Server 2000 Counters. The counters presented are not the same as the counters for SQL Server 7.0. The SQL Server 7.0 counters are listed in the appendix. Determining Memory Usage for OS and BPOOL Memory Manager: Total Server memory (KB) - Represents all of SQL usage Buffer Manager: Total Pages - Represents total bpool usage To determine how much of Total Server Memory (KB) represents MemToLeave space; subtract Buffer Manager: Total Pages. The result can be verified against DBCC MEMORYSTATUS, specifically Dynamic Memory Manager: OS In Use. It should however be noted that this value only represents requests that went thru the bpool. Memory reserved outside of the bpool by components such as COM objects will not show up here, although they will count against SQL Server private byte count. Buffer Counts: Target (Buffer Manager: Target Pages) The size the buffer pool would like to be. If this value is larger than committed, the buffer pool is growing. Buffer Counts: Committed (Buffer Manager: Total Pages) The total number of buffers committed in the OS. This is the current size of the buffer pool. Buffer Counts: Min Free This is the number of pages that the buffer pool tries to keep on the free list. If the free list falls below this value, the buffer pool will attempt to populate it by discarding old pages from the data or procedure cache. Buffer Distribution: Free (Buffer Manager / Buffer Partition: Free Pages) This value represents the buffers currently not in use. These are available for data or may be requested by other components and mar

Table of Contents Header Files The #define Guard Header File Dependencies Inline Functions The -inl.h Files Function Parameter Ordering Names and Order of Includes Scoping Namespaces Nested Classes Nonmember, Static Member, and Global Functions Local Variables Static and Global Variables Classes Doing Work in Constructors Default Constructors Explicit Constructors Copy Constructors Structs vs. Classes Inheritance Multiple Inheritance Interfaces Operator Overloading Access Control Declaration Order Write Short Functions Google-Specific Magic Smart Pointers cpplint Other C++ Features Reference Arguments Function Overloading Default Arguments Variable-Length Arrays and alloca() Friends Exceptions Run-Time Type Information (RTTI) Casting Streams Preincrement and Predecrement Use of const Integer Types 64-bit Portability Preprocessor Macros 0 and NULL sizeof Boost C++0x Naming General Naming Rules File Names Type Names Variable Names Constant Names Function Names Namespace Names Enumerator Names Macro Names Exceptions to Naming Rules Comments Comment Style File Comments Class Comments Function Comments Variable Comments Implementation Comments Punctuation, Spelling and Grammar TODO Comments Deprecation Comments Formatting Line Length Non-ASCII Characters Spaces vs. Tabs Function Declarations and Definitions Function Calls Conditionals Loops and Switch Statements Pointer and Reference Expressions Boolean Expressions Return Values Variable and Array Initialization Preprocessor Directives Class Format Constructor Initializer Lists Namespace Formatting Horizontal Whitespace Vertical Whitespace Exceptions to the Rules Existing Non-conformant Code Windows Code Important Note Displaying Hidden Details in this Guide link ▶This style guide contains many details that are initially hidden from view. They are marked by the triangle icon, which you see here on your left. Click it now. You should see "Hooray" appear below. Hooray! Now you know you can expand points to get more details. Alternatively, there's an "expand all" at the top of this document. Background C++ is the main development language used by many of Google's open-source projects. As every C++ programmer knows, the language has many powerful features, but this power brings with it complexity, which in turn can make code more bug-prone and harder to read and maintain. The goal of this guide is to manage this complexity by describing in detail the dos and don'ts of writing C++ code. These rules exist to keep the code base manageable while still allowing coders to use C++ language features productively. Style, also known as readability, is what we call the conventions that govern our C++ code. The term Style is a bit of a misnomer, since these conventions cover far more than just source file formatting. One way in which we keep the code base manageable is by enforcing consistency. It is very important that any programmer be able to look at another's code and quickly understand it. Maintaining a uniform style and following conventions means that we can more easily use "pattern-matching" to infer what various symbols are and what invariants are true about them. Creating common, required idioms and patterns makes code much easier to understand. In some cases there might be good arguments for changing certain style rules, but we nonetheless keep things as they are in order to preserve consistency. Another issue this guide addresses is that of C++ feature bloat. C++ is a huge language with many advanced features. In some cases we constrain, or even ban, use of certain features. We do this to keep code simple and to avoid the various common errors and problems that these features can cause. This guide lists these features and explains why their use is restricted. Open-source projects developed by Google conform to the requirements in this guide. Note that this guide is not a C++ tutorial: we assume that the reader is familiar with the language. Header Files In general, every .cc file should have an associated .h file. There are some common exceptions, such as unittests and small .cc files containing just a main() function. Correct use of header files can make a huge difference to the readability, size and performance of your code. The following rules will guide you through the various pitfalls of using header files. The #define Guard link ▶All header files should have #define guards to prevent multiple inclusion. The format of the symbol name should be ___H_. To guarantee uniqueness, they should be based on the full path in a project's source tree. For example, the file foo/src/bar/baz.h in project foo should have the following guard: #ifndef FOO_BAR_BAZ_H_ #define FOO_BAR_BAZ_H_ ... #endif // FOO_BAR_BAZ_H_ Header File Dependencies link ▶Don't use an #include when a forward declaration would suffice. When you include a header file you introduce a dependency that will cause your code to be recompiled whenever the header file changes. If your header file includes other header files, any change to those files will cause any code that includes your header to be recompiled. Therefore, we prefer to minimize includes, particularly includes of header files in other header files. You can significantly minimize the number of header files you need to include in your own header files by using forward declarations. For example, if your header file uses the File class in ways that do not require access to the declaration of the File class, your header file can just forward declare class File; instead of having to #include "file/base/file.h". How can we use a class Foo in a header file without access to its definition? We can declare data members of type Foo* or Foo&. We can declare (but not define) functions with arguments, and/or return values, of type Foo. (One exception is if an argument Foo or const Foo& has a non-explicit, one-argument constructor, in which case we need the full definition to support automatic type conversion.) We can declare static data members of type Foo. This is because static data members are defined outside the class definition. On the other hand, you must include the header file for Foo if your class subclasses Foo or has a data member of type Foo. Sometimes it makes sense to have pointer (or better, scoped_ptr) members instead of object members. However, this complicates code readability and imposes a performance penalty, so avoid doing this transformation if the only purpose is to minimize includes in header files. Of course, .cc files typically do require the definitions of the classes they use, and usually have to include several header files. Note: If you use a symbol Foo in your source file, you should bring in a definition for Foo yourself, either via an #include or via a forward declaration. Do not depend on the symbol being brought in transitively via headers not directly included. One exception is if Foo is used in myfile.cc, it's ok to #include (or forward-declare) Foo in myfile.h, instead of myfile.cc. Inline Functions link ▶Define functions inline only when they are small, say, 10 lines or less. Definition: You can declare functions in a way that allows the compiler to expand them inline rather than calling them through the usual function call mechanism. Pros: Inlining a function can generate more efficient object code, as long as the inlined function is small. Feel free to inline accessors and mutators, and other short, performance-critical functions. Cons: Overuse of inlining can actually make programs slower. Depending on a function's size, inlining it can cause the code size to increase or decrease. Inlining a very small accessor function will usually decrease code size while inlining a very large function can dramatically increase code size. On modern processors smaller code usually runs faster due to better use of the instruction cache. Decision: A decent rule of thumb is to not inline a function if it is more than 10 lines long. Beware of destructors, which are often longer than they appear because of implicit member- and base-destructor calls! Another useful rule of thumb: it's typically not cost effective to inline functions with loops or switch statements (unless, in the common case, the loop or switch statement is never executed). It is important to know that functions are not always inlined even if they are declared as such; for example, virtual and recursive functions are not normally inlined. Usually recursive functions should not be inline. The main reason for making a virtual function inline is to place its definition in the class, either for convenience or to document its behavior, e.g., for accessors and mutators. The -inl.h Files link ▶You may use file names with a -inl.h suffix to define complex inline functions when needed. The definition of an inline function needs to be in a header file, so that the compiler has the definition available for inlining at the call sites. However, implementation code properly belongs in .cc files, and we do not like to have much actual code in .h files unless there is a readability or performance advantage. If an inline function definition is short, with very little, if any, logic in it, you should put the code in your .h file. For example, accessors and mutators should certainly be inside a class definition. More complex inline functions may also be put in a .h file for the convenience of the implementer and callers, though if this makes the .h file too unwieldy you can instead put that code in a separate -inl.h file. This separates the implementation from the class definition, while still allowing the implementation to be included where necessary. Another use of -inl.h files is for definitions of function templates. This can be used to keep your template definitions easy to read. Do not forget that a -inl.h file requires a #define guard just like any other header file. Function Parameter Ordering link ▶When defining a function, parameter order is: inputs, then outputs. Parameters to C/C++ functions are either input to the function, output from the function, or both. Input parameters are usually values or const references, while output and input/output parameters will be non-const pointers. When ordering function parameters, put all input-only parameters before any output parameters. In particular, do not add new parameters to the end of the function just because they are new; place new input-only parameters before the output parameters. This is not a hard-and-fast rule. Parameters that are both input and output (often classes/structs) muddy the waters, and, as always, consistency with related functions may require you to bend the rule. Names and Order of Includes link ▶Use standard order for readability and to avoid hidden dependencies: C library, C++ library, other libraries' .h, your project's .h. All of a project's header files should be listed as descentants of the project's source directory without use of UNIX directory shortcuts . (the current directory) or .. (the parent directory). For example, google-awesome-project/src/base/logging.h should be included as #include "base/logging.h" In dir/foo.cc, whose main purpose is to implement or test the stuff in dir2/foo2.h, order your includes as follows: dir2/foo2.h (preferred location — see details below). C system files. C++ system files. Other libraries' .h files. Your project's .h files. The preferred ordering reduces hidden dependencies. We want every header file to be compilable on its own. The easiest way to achieve this is to make sure that every one of them is the first .h file #included in some .cc. dir/foo.cc and dir2/foo2.h are often in the same directory (e.g. base/basictypes_test.cc and base/basictypes.h), but can be in different directories too. Within each section it is nice to order the includes alphabetically. For example, the includes in google-awesome-project/src/foo/internal/fooserver.cc might look like this: #include "foo/public/fooserver.h" // Preferred location. #include #include #include #include #include "base/basictypes.h" #include "base/commandlineflags.h" #include "foo/public/bar.h" Scoping Namespaces link ▶Unnamed namespaces in .cc files are encouraged. With named namespaces, choose the name based on the project, and possibly its path. Do not use a using-directive. Definition: Namespaces subdivide the global scope into distinct, named scopes, and so are useful for preventing name collisions in the global scope. Pros: Namespaces provide a (hierarchical) axis of naming, in addition to the (also hierarchical) name axis provided by classes. For example, if two different projects have a class Foo in the global scope, these symbols may collide at compile time or at runtime. If each project places their code in a namespace, project1::Foo and project2::Foo are now distinct symbols that do not collide. Cons: Namespaces can be confusing, because they provide an additional (hierarchical) axis of naming, in addition to the (also hierarchical) name axis provided by classes. Use of unnamed spaces in header files can easily cause violations of the C++ One Definition Rule (ODR). Decision: Use namespaces according to the policy described below. Unnamed Namespaces Unnamed namespaces are allowed and even encouraged in .cc files, to avoid runtime naming conflicts: namespace { // This is in a .cc file. // The content of a namespace is not indented enum { kUnused, kEOF, kError }; // Commonly used tokens. bool AtEof() { return pos_ == kEOF; } // Uses our namespace's EOF. } // namespace However, file-scope declarations that are associated with a particular class may be declared in that class as types, static data members or static member functions rather than as members of an unnamed namespace. Terminate the unnamed namespace as shown, with a comment // namespace. Do not use unnamed namespaces in .h files. Named Namespaces Named namespaces should be used as follows: Namespaces wrap the entire source file after includes, gflags definitions/declarations, and forward declarations of classes from other namespaces: // In the .h file namespace mynamespace { // All declarations are within the namespace scope. // Notice the lack of indentation. class MyClass { public: ... void Foo(); }; } // namespace mynamespace // In the .cc file namespace mynamespace { // Definition of functions is within scope of the namespace. void MyClass::Foo() { ... } } // namespace mynamespace The typical .cc file might have more complex detail, including the need to reference classes in other namespaces. #include "a.h" DEFINE_bool(someflag, false, "dummy flag"); class C; // Forward declaration of class C in the global namespace. namespace a { class A; } // Forward declaration of a::A. namespace b { ...code for b... // Code goes against the left margin. } // namespace b Do not declare anything in namespace std, not even forward declarations of standard library classes. Declaring entities in namespace std is undefined behavior, i.e., not portable. To declare entities from the standard library, include the appropriate header file. You may not use a using-directive to make all names from a namespace available. // Forbidden -- This pollutes the namespace. using namespace foo; You may use a using-declaration anywhere in a .cc file, and in functions, methods or classes in .h files. // OK in .cc files. // Must be in a function, method or class in .h files. using ::foo::bar; Namespace aliases are allowed anywhere in a .cc file, anywhere inside the named namespace that wraps an entire .h file, and in functions and methods. // Shorten access to some commonly used names in .cc files. namespace fbz = ::foo::bar::baz; // Shorten access to some commonly used names (in a .h file). namespace librarian { // The following alias is available to all files including // this header (in namespace librarian): // alias names should therefore be chosen consistently // within a project. namespace pd_s = ::pipeline_diagnostics::sidetable; inline void my_inline_function() { // namespace alias local to a function (or method). namespace fbz = ::foo::bar::baz; ... } } // namespace librarian Note that an alias in a .h file is visible to everyone #including that file, so public headers (those available outside a project) and headers transitively #included by them, should avoid defining aliases, as part of the general goal of keeping public APIs as small as possible. Nested Classes link ▶Although you may use public nested classes when they are part of an interface, consider a namespace to keep declarations out of the global scope. Definition: A class can define another class within it; this is also called a member class. class Foo { private: // Bar is a member class, nested within Foo. class Bar { ... }; }; Pros: This is useful when the nested (or member) class is only used by the enclosing class; making it a member puts it in the enclosing class scope rather than polluting the outer scope with the class name. Nested classes can be forward declared within the enclosing class and then defined in the .cc file to avoid including the nested class definition in the enclosing class declaration, since the nested class definition is usually only relevant to the implementation. Cons: Nested classes can be forward-declared only within the definition of the enclosing class. Thus, any header file manipulating a Foo::Bar* pointer will have to include the full class declaration for Foo. Decision: Do not make nested classes public unless they are actually part of the interface, e.g., a class that holds a set of options for some method. Nonmember, Static Member, and Global Functions link ▶Prefer nonmember functions within a namespace or static member functions to global functions; use completely global functions rarely. Pros: Nonmember and static member functions can be useful in some situations. Putting nonmember functions in a namespace avoids polluting the global namespace. Cons: Nonmember and static member functions may make more sense as members of a new class, especially if they access external resources or have significant dependencies. Decision: Sometimes it is useful, or even necessary, to define a function not bound to a class instance. Such a function can be either a static member or a nonmember function. Nonmember functions should not depend on external variables, and should nearly always exist in a namespace. Rather than creating classes only to group static member functions which do not share static data, use namespaces instead. Functions defined in the same compilation unit as production classes may introduce unnecessary coupling and link-time dependencies when directly called from other compilation units; static member functions are particularly susceptible to this. Consider extracting a new class, or placing the functions in a namespace possibly in a separate library. If you must define a nonmember function and it is only needed in its .cc file, use an unnamed namespace or static linkage (eg static int Foo() {...}) to limit its scope. Local Variables link ▶Place a function's variables in the narrowest scope possible, and initialize variables in the declaration. C++ allows you to declare variables anywhere in a function. We encourage you to declare them in as local a scope as possible, and as close to the first use as possible. This makes it easier for the reader to find the declaration and see what type the variable is and what it was initialized to. In particular, initialization should be used instead of declaration and assignment, e.g. int i; i = f(); // Bad -- initialization separate from declaration. int j = g(); // Good -- declaration has initialization. Note that gcc implements for (int i = 0; i < 10; ++i) correctly (the scope of i is only the scope of the for loop), so you can then reuse i in another for loop in the same scope. It also correctly scopes declarations in if and while statements, e.g. while (const char* p = strchr(str, '/')) str = p + 1; There is one caveat: if the variable is an object, its constructor is invoked every time it enters scope and is created, and its destructor is invoked every time it goes out of scope. // Inefficient implementation: for (int i = 0; i < 1000000; ++i) { Foo f; // My ctor and dtor get called 1000000 times each. f.DoSomething(i); } It may be more efficient to declare such a variable used in a loop outside that loop: Foo f; // My ctor and dtor get called once each. for (int i = 0; i < 1000000; ++i) { f.DoSomething(i); } Static and Global Variables link ▶Static or global variables of class type are forbidden: they cause hard-to-find bugs due to indeterminate order of construction and destruction. Objects with static storage duration, including global variables, static variables, static class member variables, and function static variables, must be Plain Old Data (POD): only ints, chars, floats, or pointers, or arrays/structs of POD. The order in which class constructors and initializers for static variables are called is only partially specified in C++ and can even change from build to build, which can cause bugs that are difficult to find. Therefore in addition to banning globals of class type, we do not allow static POD variables to be initialized with the result of a function, unless that function (such as getenv(), or getpid()) does not itself depend on any other globals. Likewise, the order in which destructors are called is defined to be the reverse of the order in which the constructors were called. Since constructor order is indeterminate, so is destructor order. For example, at program-end time a static variable might have been destroyed, but code still running -- perhaps in another thread -- tries to access it and fails. Or the destructor for a static 'string' variable might be run prior to the destructor for another variable that contains a reference to that string. As a result we only allow static variables to contain POD data. This rule completely disallows vector (use C arrays instead), or string (use const char []). If you need a static or global variable of a class type, consider initializing a pointer (which will never be freed), from either your main() function or from pthread_once(). Note that this must be a raw pointer, not a "smart" pointer, since the smart pointer's destructor will have the order-of-destructor issue that we are trying to avoid. Classes Classes are the fundamental unit of code in C++. Naturally, we use them extensively. This section lists the main dos and don'ts you should follow when writing a class. Doing Work in Constructors link ▶In general, constructors should merely set member variables to their initial values. Any complex initialization should go in an explicit Init() method. Definition: It is possible to perform initialization in the body of the constructor. Pros: Convenience in typing. No need to worry about whether the class has been initialized or not. Cons: The problems with doing work in constructors are: There is no easy way for constructors to signal errors, short of using exceptions (which are forbidden). If the work fails, we now have an object whose initialization code failed, so it may be an indeterminate state. If the work calls virtual functions, these calls will not get dispatched to the subclass implementations. Future modification to your class can quietly introduce this problem even if your class is not currently subclassed, causing much confusion. If someone creates a global variable of this type (which is against the rules, but still), the constructor code will be called before main(), possibly breaking some implicit assumptions in the constructor code. For instance, gflags will not yet have been initialized. Decision: If your object requires non-trivial initialization, consider having an explicit Init() method. In particular, constructors should not call virtual functions, attempt to raise errors, access potentially uninitialized global variables, etc. Default Constructors link ▶You must define a default constructor if your class defines member variables and has no other constructors. Otherwise the compiler will do it for you, badly. Definition: The default constructor is called when we new a class object with no arguments. It is always called when calling new[] (for arrays). Pros: Initializing structures by default, to hold "impossible" values, makes debugging much easier. Cons: Extra work for you, the code writer. Decision: If your class defines member variables and has no other constructors you must define a default constructor (one that takes no arguments). It should preferably initialize the object in such a way that its internal state is consistent and valid. The reason for this is that if you have no other constructors and do not define a default constructor, the compiler will generate one for you. This compiler generated constructor may not initialize your object sensibly. If your class inherits from an existing class but you add no new member variables, you are not required to have a default constructor. Explicit Constructors link ▶Use the C++ keyword explicit for constructors with one argument. Definition: Normally, if a constructor takes one argument, it can be used as a conversion. For instance, if you define Foo::Foo(string name) and then pass a string to a function that expects a Foo, the constructor will be called to convert the string into a Foo and will pass the Foo to your function for you. This can be convenient but is also a source of trouble when things get converted and new objects created without you meaning them to. Declaring a constructor explicit prevents it from being invoked implicitly as a conversion. Pros: Avoids undesirable conversions. Cons: None. Decision: We require all single argument constructors to be explicit. Always put explicit in front of one-argument constructors in the class definition: explicit Foo(string name); The exception is copy constructors, which, in the rare cases when we allow them, should probably not be explicit. Classes that are intended to be transparent wrappers around other classes are also exceptions. Such exceptions should be clearly marked with comments. Copy Constructors link ▶Provide a copy constructor and assignment operator only when necessary. Otherwise, disable them with DISALLOW_COPY_AND_ASSIGN. Definition: The copy constructor and assignment operator are used to create copies of objects. The copy constructor is implicitly invoked by the compiler in some situations, e.g. passing objects by value. Pros: Copy constructors make it easy to copy objects. STL containers require that all contents be copyable and assignable. Copy constructors can be more efficient than CopyFrom()-style workarounds because they combine construction with copying, the compiler can elide them in some contexts, and they make it easier to avoid heap allocation. Cons: Implicit copying of objects in C++ is a rich source of bugs and of performance problems. It also reduces readability, as it becomes hard to track which objects are being passed around by value as opposed to by reference, and therefore where changes to an object are reflected. Decision: Few classes need to be copyable. Most should have neither a copy constructor nor an assignment operator. In many situations, a pointer or reference will work just as well as a copied value, with better performance. For example, you can pass function parameters by reference or pointer instead of by value, and you can store pointers rather than objects in an STL container. If your class needs to be copyable, prefer providing a copy method, such as CopyFrom() or Clone(), rather than a copy constructor, because such methods cannot be invoked implicitly. If a copy method is insufficient in your situation (e.g. for performance reasons, or because your class needs to be stored by value in an STL container), provide both a copy constructor and assignment operator. If your class does not need a copy constructor or assignment operator, you must explicitly disable them. To do so, add dummy declarations for the copy constructor and assignment operator in the private: section of your class, but do not provide any corresponding definition (so that any attempt to use them results in a link error). For convenience, a DISALLOW_COPY_AND_ASSIGN macro can be used: // A macro to disallow the copy constructor and operator= functions // This should be used in the private: declarations for a class #define DISALLOW_COPY_AND_ASSIGN(TypeName) \ TypeName(const TypeName&); \ void operator=(const TypeName&) Then, in class Foo: class Foo { public: Foo(int f); ~Foo(); private: DISALLOW_COPY_AND_ASSIGN(Foo); }; Structs vs. Classes link ▶Use a struct only for passive objects that carry data; everything else is a class. The struct and class keywords behave almost identically in C++. We add our own semantic meanings to each keyword, so you should use the appropriate keyword for the data-type you're defining. structs should be used for passive objects that carry data, and may have associated constants, but lack any functionality other than access/setting the data members. The accessing/setting of fields is done by directly accessing the fields rather than through method invocations. Methods should not provide behavior but should only be used to set up the data members, e.g., constructor, destructor, Initialize(), Reset(), Validate(). If more functionality is required, a class is more appropriate. If in doubt, make it a class. For consistency with STL, you can use struct instead of class for functors and traits. Note that member variables in structs and classes have different naming rules. Inheritance link ▶Composition is often more appropriate than inheritance. When using inheritance, make it public. Definition: When a sub-class inherits from a base class, it includes the definitions of all the data and operations that the parent base class defines. In practice, inheritance is used in two major ways in C++: implementation inheritance, in which actual code is inherited by the child, and interface inheritance, in which only method names are inherited. Pros: Implementation inheritance reduces code size by re-using the base class code as it specializes an existing type. Because inheritance is a compile-time declaration, you and the compiler can understand the operation and detect errors. Interface inheritance can be used to programmatically enforce that a class expose a particular API. Again, the compiler can detect errors, in this case, when a class does not define a necessary method of the API. Cons: For implementation inheritance, because the code implementing a sub-class is spread between the base and the sub-class, it can be more difficult to understand an implementation. The sub-class cannot override functions that are not virtual, so the sub-class cannot change implementation. The base class may also define some data members, so that specifies physical layout of the base class. Decision: All inheritance should be public. If you want to do private inheritance, you should be including an instance of the base class as a member instead. Do not overuse implementation inheritance. Composition is often more appropriate. Try to restrict use of inheritance to the "is-a" case: Bar subclasses Foo if it can reasonably be said that Bar "is a kind of" Foo. Make your destructor virtual if necessary. If your class has virtual methods, its destructor should be virtual. Limit the use of protected to those member functions that might need to be accessed from subclasses. Note that data members should be private. When redefining an inherited virtual function, explicitly declare it virtual in the declaration of the derived class. Rationale: If virtual is omitted, the reader has to check all ancestors of the class in question to determine if the function is virtual or not. Multiple Inheritance link ▶Only very rarely is multiple implementation inheritance actually useful. We allow multiple inheritance only when at most one of the base classes has an implementation; all other base classes must be pure interface classes tagged with the Interface suffix. Definition: Multiple inheritance allows a sub-class to have more than one base class. We distinguish between base classes that are pure interfaces and those that have an implementation. Pros: Multiple implementation inheritance may let you re-use even more code than single inheritance (see Inheritance). Cons: Only very rarely is multiple implementation inheritance actually useful. When multiple implementation inheritance seems like the solution, you can usually find a different, more explicit, and cleaner solution. Decision: Multiple inheritance is allowed only when all superclasses, with the possible exception of the first one, are pure interfaces. In order to ensure that they remain pure interfaces, they must end with the Interface suffix. Note: There is an exception to this rule on Windows. Interfaces link ▶Classes that satisfy certain conditions are allowed, but not required, to end with an Interface suffix. Definition: A class is a pure interface if it meets the following requirements: It has only public pure virtual ("= 0") methods and static methods (but see below for destructor). It may not have non-static data members. It need not have any constructors defined. If a constructor is provided, it must take no arguments and it must be protected. If it is a subclass, it may only be derived from classes that satisfy these conditions and are tagged with the Interface suffix. An interface class can never be directly instantiated because of the pure virtual method(s) it declares. To make sure all implementations of the interface can be destroyed correctly, they must also declare a virtual destructor (in an exception to the first rule, this should not be pure). See Stroustrup, The C++ Programming Language, 3rd edition, section 12.4 for details. Pros: Tagging a class with the Interface suffix lets others know that they must not add implemented methods or non static data members. This is particularly important in the case of multiple inheritance. Additionally, the interface concept is already well-understood by Java programmers. Cons: The Interface suffix lengthens the class name, which can make it harder to read and understand. Also, the interface property may be considered an implementation detail that shouldn't be exposed to clients. Decision: A class may end with Interface only if it meets the above requirements. We do not require the converse, however: classes that meet the above requirements are not required to end with Interface. Operator Overloading link ▶Do not overload operators except in rare, special circumstances. Definition: A class can define that operators such as + and / operate on the class as if it were a built-in type. Pros: Can make code appear more intuitive because a class will behave in the same way as built-in types (such as int). Overloaded operators are more playful names for functions that are less-colorfully named, such as Equals() or Add(). For some template functions to work correctly, you may need to define operators. Cons: While operator overloading can make code more intuitive, it has several drawbacks: It can fool our intuition into thinking that expensive operations are cheap, built-in operations. It is much harder to find the call sites for overloaded operators. Searching for Equals() is much easier than searching for relevant invocations of ==. Some operators work on pointers too, making it easy to introduce bugs. Foo + 4 may do one thing, while &Foo + 4 does something totally different. The compiler does not complain for either of these, making this very hard to debug. Overloading also has surprising ramifications. For instance, if a class overloads unary operator&, it cannot safely be forward-declared. Decision: In general, do not overload operators. The assignment operator (operator=), in particular, is insidious and should be avoided. You can define functions like Equals() and CopyFrom() if you need them. Likewise, avoid the dangerous unary operator& at all costs, if there's any possibility the class might be forward-declared. However, there may be rare cases where you need to overload an operator to interoperate with templates or "standard" C++ classes (such as operator<<(ostream&, const T&) for logging). These are acceptable if fully justified, but you should try to avoid these whenever possible. In particular, do not overload operator== or operator< just so that your class can be used as a key in an STL container; instead, you should create equality and comparison functor types when declaring the container. Some of the STL algorithms do require you to overload operator==, and you may do so in these cases, provided you document why. See also Copy Constructors and Function Overloading. Access Control link ▶Make data members private, and provide access to them through accessor functions as needed (for technical reasons, we allow data members of a test fixture class to be protected when using Google Test). Typically a variable would be called foo_ and the accessor function foo(). You may also want a mutator function set_foo(). Exception: static const data members (typically called kFoo) need not be private. The definitions of accessors are usually inlined in the header file. See also Inheritance and Function Names. Declaration Order link ▶Use the specified order of declarations within a class: public: before private:, methods before data members (variables), etc. Your class definition should start with its public: section, followed by its protected: section and then its private: section. If any of these sections are empty, omit them. Within each section, the declarations generally should be in the following order: Typedefs and Enums Constants (static const data members) Constructors Destructor Methods, including static methods Data Members (except static const data members) Friend declarations should always be in the private section, and the DISALLOW_COPY_AND_ASSIGN macro invocation should be at the end of the private: section. It should be the last thing in the class. See Copy Constructors. Method definitions in the corresponding .cc file should be the same as the declaration order, as much as possible. Do not put large method definitions inline in the class definition. Usually, only trivial or performance-critical, and very short, methods may be defined inline. See Inline Functions for more details. Write Short Functions link ▶Prefer small and focused functions. We recognize that long functions are sometimes appropriate, so no hard limit is placed on functions length. If a function exceeds about 40 lines, think about whether it can be broken up without harming the structure of the program. Even if your long function works perfectly now, someone modifying it in a few months may add new behavior. This could result in bugs that are hard to find. Keeping your functions short and simple makes it easier for other people to read and modify your code. You could find long and complicated functions when working with some code. Do not be intimidated by modifying existing code: if working with such a function proves to be difficult, you find that errors are hard to debug, or you want to use a piece of it in several different contexts, consider breaking up the function into smaller and more manageable pieces. Google-Specific Magic There are various tricks and utilities that we use to make C++ code more robust, and various ways we use C++ that may differ from what you see elsewhere. Smart Pointers link ▶If you actually need pointer semantics, scoped_ptr is great. You should only use std::tr1::shared_ptr under very specific conditions, such as when objects need to be held by STL containers. You should never use auto_ptr. "Smart" pointers are objects that act like pointers but have added semantics. When a scoped_ptr is destroyed, for instance, it deletes the object it's pointing to. shared_ptr is the same way, but implements reference-counting so only the last pointer to an object deletes it. Generally speaking, we prefer that we design code with clear object ownership. The clearest object ownership is obtained by using an object directly as a field or local variable, without using pointers at all. On the other extreme, by their very definition, reference counted pointers are owned by nobody. The problem with this design is that it is easy to create circular references or other strange conditions that cause an object to never be deleted. It is also slow to perform atomic operations every time a value is copied or assigned. Although they are not recommended, reference counted pointers are sometimes the simplest and most elegant way to solve a problem. cpplint link ▶Use cpplint.py to detect style errors. cpplint.py is a tool that reads a source file and identifies many style errors. It is not perfect, and has both false positives and false negatives, but it is still a valuable tool. False positives can be ignored by putting // NOLINT at the end of the line. Some projects have instructions on how to run cpplint.py from their project tools. If the project you are contributing to does not, you can download cpplint.py separately. Other C++ Features Reference Arguments link ▶All parameters passed by reference must be labeled const. Definition: In C, if a function needs to modify a variable, the parameter must use a pointer, eg int foo(int *pval). In C++, the function can alternatively declare a reference parameter: int foo(int &val). Pros: Defining a parameter as reference avoids ugly code like (*pval)++. Necessary for some applications like copy constructors. Makes it clear, unlike with pointers, that NULL is not a possible value. Cons: References can be confusing, as they have value syntax but pointer semantics. Decision: Within function parameter lists all references must be const: void Foo(const string &in, string *out); In fact it is a very strong convention in Google code that input arguments are values or const references while output arguments are pointers. Input parameters may be const pointers, but we never allow non-const reference parameters. One case when you might want an input parameter to be a const pointer is if you want to emphasize that the argument is not copied, so it must exist for the lifetime of the object; it is usually best to document this in comments as well. STL adapters such as bind2nd and mem_fun do not permit reference parameters, so you must declare functions with pointer parameters in these cases, too. Function Overloading link ▶Use overloaded functions (including constructors) only if a reader looking at a call site can get a good idea of what is happening without having to first figure out exactly which overload is being called. Definition: You may write a function that takes a const string& and overload it with another that takes const char*. class MyClass { public: void Analyze(const string &text); void Analyze(const char *text, size_t textlen); }; Pros: Overloading can make code more intuitive by allowing an identically-named function to take different arguments. It may be necessary for templatized code, and it can be convenient for Visitors. Cons: If a function is overloaded by the argument types alone, a reader may have to understand C++'s complex matching rules in order to tell what's going on. Also many people are confused by the semantics of inheritance if a derived class overrides only some of the variants of a function. Decision: If you want to overload a function, consider qualifying the name with some information about the arguments, e.g., AppendString(), AppendInt() rather than just Append(). Default Arguments link ▶We do not allow default function parameters, except in a few uncommon situations explained below. Pros: Often you have a function that uses lots of default values, but occasionally you want to override the defaults. Default parameters allow an easy way to do this without having to define many functions for the rare exceptions. Cons: People often figure out how to use an API by looking at existing code that uses it. Default parameters are more difficult to maintain because copy-and-paste from previous code may not reveal all the parameters. Copy-and-pasting of code segments can cause major problems when the default arguments are not appropriate for the new code. Decision: Except as described below, we require all arguments to be explicitly specified, to force programmers to consider the API and the values they are passing for each argument rather than silently accepting defaults they may not be aware of. One specific exception is when default arguments are used to simulate variable-length argument lists. // Support up to 4 params by using a default empty AlphaNum. string StrCat(const AlphaNum &a, const AlphaNum &b = gEmptyAlphaNum, const AlphaNum &c = gEmptyAlphaNum, const AlphaNum &d = gEmptyAlphaNum); Variable-Length Arrays and alloca() link ▶We do not allow variable-length arrays or alloca(). Pros: Variable-length arrays have natural-looking syntax. Both variable-length arrays and alloca() are very efficient. Cons: Variable-length arrays and alloca are not part of Standard C++. More importantly, they allocate a data-dependent amount of stack space that can trigger difficult-to-find memory overwriting bugs: "It ran fine on my machine, but dies mysteriously in production". Decision: Use a safe allocator instead, such as scoped_ptr/scoped_array. Friends link ▶We allow use of friend classes and functions, within reason. Friends should usually be defined in the same file so that the reader does not have to look in another file to find uses of the private members of a class. A common use of friend is to have a FooBuilder class be a friend of Foo so that it can construct the inner state of Foo correctly, without exposing this state to the world. In some cases it may be useful to make a unittest class a friend of the class it tests. Friends extend, but do not break, the encapsulation boundary of a class. In some cases this is better than making a member public when you want to give only one other class access to it. However, most classes should interact with other classes solely through their public members. Exceptions link ▶We do not use C++ exceptions. Pros: Exceptions allow higher levels of an application to decide how to handle "can't happen" failures in deeply nested functions, without the obscuring and error-prone bookkeeping of error codes. Exceptions are used by most other modern languages. Using them in C++ would make it more consistent with Python, Java, and the C++ that others are familiar with. Some third-party C++ libraries use exceptions, and turning them off internally makes it harder to integrate with those libraries. Exceptions are the only way for a constructor to fail. We can simulate this with a factory function or an Init() method, but these require heap allocation or a new "invalid" state, respectively. Exceptions are really handy in testing frameworks. Cons: When you add a throw statement to an existing function, you must examine all of its transitive callers. Either they must make at least the basic exception safety guarantee, or they must never catch the exception and be happy with the program terminating as a result. For instance, if f() calls g() calls h(), and h throws an exception that f catches, g has to be careful or it may not clean up properly. More generally, exceptions make the control flow of programs difficult to evaluate by looking at code: functions may return in places you don't expect. This causes maintainability and debugging difficulties. You can minimize this cost via some rules on how and where exceptions can be used, but at the cost of more that a developer needs to know and understand. Exception safety requires both RAII and different coding practices. Lots of supporting machinery is needed to make writing correct exception-safe code easy. Further, to avoid requiring readers to understand the entire call graph, exception-safe code must isolate logic that writes to persistent state into a "commit" phase. This will have both benefits and costs (perhaps where you're forced to obfuscate code to isolate the commit). Allowing exceptions would force us to always pay those costs even when they're not worth it. Turning on exceptions adds data to each binary produced, increasing compile time (probably slightly) and possibly increasing address space pressure. The availability of exceptions may encourage developers to throw them when they are not appropriate or recover from them when it's not safe to do so. For example, invalid user input should not cause exceptions to be thrown. We would need to make the style guide even longer to document these restrictions! Decision: On their face, the benefits of using exceptions outweigh the costs, especially in new projects. However, for existing code, the introduction of exceptions has implications on all dependent code. If exceptions can be propagated beyond a new project, it also becomes problematic to integrate the new project into existing exception-free code. Because most existing C++ code at Google is not prepared to deal with exceptions, it is comparatively difficult to adopt new code that generates exceptions. Given that Google's existing code is not exception-tolerant, the costs of using exceptions are somewhat greater than the costs in a new project. The conversion process would be slow and error-prone. We don't believe that the available alternatives to exceptions, such as error codes and assertions, introduce a significant burden. Our advice against using exceptions is not predicated on philosophical or moral grounds, but practical ones. Because we'd like to use our open-source projects at Google and it's difficult to do so if those projects use exceptions, we need to advise against exceptions in Google open-source projects as well. Things would probably be different if we had to do it all over again from scratch. There is an exception to this rule (no pun intended) for Windows code. Run-Time Type Information (RTTI) link ▶We do not use Run Time Type Information (RTTI). Definition: RTTI allows a programmer to query the C++ class of an object at run time. Pros: It is useful in some unittests. For example, it is useful in tests of factory classes where the test has to verify that a newly created object has the expected dynamic type. In rare circumstances, it is useful even outside of tests. Cons: A query of type during run-time typically means a design problem. If you need to know the type of an object at runtime, that is often an indication that you should reconsider the design of your class. Decision: Do not use RTTI, except in unittests. If you find yourself in need of writing code that behaves differently based on the class of an object, consider one of the alternatives to querying the type. Virtual methods are the preferred way of executing different code paths depending on a specific subclass type. This puts the work within the object itself. If the work belongs outside the object and instead in some processing code, consider a double-dispatch solution, such as the Visitor design pattern. This allows a facility outside the object itself to determine the type of class using the built-in type system. If you think you truly cannot use those ideas, you may use RTTI. But think twice about it. :-) Then think twice again. Do not hand-implement an RTTI-like workaround. The arguments against RTTI apply just as much to workarounds like class hierarchies with type tags. Casting link ▶Use C++ casts like static_cast(). Do not use other cast formats like int y = (int)x; or int y = int(x);. Definition: C++ introduced a different cast system from C that distinguishes the types of cast operations. Pros: The problem with C casts is the ambiguity of the operation; sometimes you are doing a conversion (e.g., (int)3.5) and sometimes you are doing a cast (e.g., (int)"hello"); C++ casts avoid this. Additionally C++ casts are more visible when searching for them. Cons: The syntax is nasty. Decision: Do not use C-style casts. Instead, use these C++-style casts. Use static_cast as the equivalent of a C-style cast that does value conversion, or when you need to explicitly up-cast a pointer from a class to its superclass. Use const_cast to remove the const qualifier (see const). Use reinterpret_cast to do unsafe conversions of pointer types to and from integer and other pointer types. Use this only if you know what you are doing and you understand the aliasing issues. Do not use dynamic_cast except in test code. If you need to know type information at runtime in this way outside of a unittest, you probably have a design flaw. Streams link ▶Use streams only for logging. Definition: Streams are a replacement for printf() and scanf(). Pros: With streams, you do not need to know the type of the object you are printing. You do not have problems with format strings not matching the argument list. (Though with gcc, you do not have that problem with printf either.) Streams have automatic constructors and destructors that open and close the relevant files. Cons: Streams make it difficult to do functionality like pread(). Some formatting (particularly the common format string idiom %.*s) is difficult if not impossible to do efficiently using streams without using printf-like hacks. Streams do not support operator reordering (the %1s directive), which is helpful for internationalization. Decision: Do not use streams, except where required by a logging interface. Use printf-like routines instead. There are various pros and cons to using streams, but in this case, as in many other cases, consistency trumps the debate. Do not use streams in your code. Extended Discussion There has been debate on this issue, so this explains the reasoning in greater depth. Recall the Only One Way guiding principle: we want to make sure that whenever we do a certain type of I/O, the code looks the same in all those places. Because of this, we do not want to allow users to decide between using streams or using printf plus Read/Write/etc. Instead, we should settle on one or the other. We made an exception for logging because it is a pretty specialized application, and for historical reasons. Proponents of streams have argued that streams are the obvious choice of the two, but the issue is not actually so clear. For every advantage of streams they point out, there is an equivalent disadvantage. The biggest advantage is that you do not need to know the type of the object to be printing. This is a fair point. But, there is a downside: you can easily use the wrong type, and the compiler will not warn you. It is easy to make this kind of mistake without knowing when using streams. cout << this; // Prints the address cout << *this; // Prints the contents The compiler does not generate an error because << has been overloaded. We discourage overloading for just this reason. Some say printf formatting is ugly and hard to read, but streams are often no better. Consider the following two fragments, both with the same typo. Which is easier to discover? cerr << "Error connecting to '" hostname.first << ":" hostname.second << ": " hostname.first, foo->bar()->hostname.second, strerror(errno)); And so on and so forth for any issue you might bring up. (You could argue, "Things would be better with the right wrappers," but if it is true for one scheme, is it not also true for the other? Also, remember the goal is to make the language smaller, not add yet more machinery that someone has to learn.) Either path would yield different advantages and disadvantages, and there is not a clearly superior solution. The simplicity doctrine mandates we settle on one of them though, and the majority decision was on printf + read/write. Preincrement and Predecrement link ▶Use prefix form (++i) of the increment and decrement operators with iterators and other template objects. Definition: When a variable is incremented (++i or i++) or decremented (--i or i--) and the value of the expression is not used, one must decide whether to preincrement (decrement) or postincrement (decrement). Pros: When the return value is ignored, the "pre" form (++i) is never less efficient than the "post" form (i++), and is often more efficient. This is because post-increment (or decrement) requires a copy of i to be made, which is the value of the expression. If i is an iterator or other non-scalar type, copying i could be expensive. Since the two types of increment behave the same when the value is ignored, why not just always pre-increment? Cons: The tradition developed, in C, of using post-increment when the expression value is not used, especially in for loops. Some find post-increment easier to read, since the "subject" (i) precedes the "verb" (++), just like in English. Decision: For simple scalar (non-object) values there is no reason to prefer one form and we allow either. For iterators and other template types, use pre-increment. Use of const link ▶We strongly recommend that you use const whenever it makes sense to do so. Definition: Declared variables and parameters can be preceded by the keyword const to indicate the variables are not changed (e.g., const int foo). Class functions can have the const qualifier to indicate the function does not change the state of the class member variables (e.g., class Foo { int Bar(char c) const; };). Pros: Easier for people to understand how variables are being used. Allows the compiler to do better type checking, and, conceivably, generate better code. Helps people convince themselves of program correctness because they know the functions they call are limited in how they can modify your variables. Helps people know what functions are safe to use without locks in multi-threaded programs. Cons: const is viral: if you pass a const variable to a function, that function must have const in its prototype (or the variable will need a const_cast). This can be a particular problem when calling library functions. Decision: const variables, data members, methods and arguments add a level of compile-time type checking; it is better to detect errors as soon as possible. Therefore we strongly recommend that you use const whenever it makes sense to do so: If a function does not modify an argument passed by reference or by pointer, that argument should be const. Declare methods to be const whenever possible. Accessors should almost always be const. Other methods should be const if they do not modify any data members, do not call any non-const methods, and do not return a non-const pointer or non-const reference to a data member. Consider making data members const whenever they do not need to be modified after construction. However, do not go crazy with const. Something like const int * const * const x; is likely overkill, even if it accurately describes how const x is. Focus on what's really useful to know: in this case, const int** x is probably sufficient. The mutable keyword is allowed but is unsafe when used with threads, so thread safety should be carefully considered first. Where to put the const Some people favor the form int const *foo to const int* foo. They argue that this is more readable because it's more consistent: it keeps the rule that const always follows the object it's describing. However, this consistency argument doesn't apply in this case, because the "don't go crazy" dictum eliminates most of the uses you'd have to be consistent with. Putting the const first is arguably more readable, since it follows English in putting the "adjective" (const) before the "noun" (int). That said, while we encourage putting const first, we do not require it. But be consistent with the code around you! Integer Types link ▶Of the built-in C++ integer types, the only one used is int. If a program needs a variable of a different size, use a precise-width integer type from , such as int16_t. Definition: C++ does not specify the sizes of its integer types. Typically people assume that short is 16 bits, int is 32 bits, long is 32 bits and long long is 64 bits. Pros: Uniformity of declaration. Cons: The sizes of integral types in C++ can vary based on compiler and architecture. Decision: defines types like int16_t, uint32_t, int64_t, etc. You should always use those in preference to short, unsigned long long and the like, when you need a guarantee on the size of an integer. Of the C integer types, only int should be used. When appropriate, you are welcome to use standard types like size_t and ptrdiff_t. We use int very often, for integers we know are not going to be too big, e.g., loop counters. Use plain old int for such things. You should assume that an int is at least 32 bits, but don't assume that it has more than 32 bits. If you need a 64-bit integer type, use int64_t or uint64_t. For integers we know can be "big", use int64_t. You should not use the unsigned integer types such as uint32_t, unless the quantity you are representing is really a bit pattern rather than a number, or unless you need defined twos-complement overflow. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this. On Unsigned Integers Some people, including some textbook authors, recommend using unsigned types to represent numbers that are never negative. This is intended as a form of self-documentation. However, in C, the advantages of such documentation are outweighed by the real bugs it can introduce. Consider: for (unsigned int i = foo.Length()-1; i >= 0; --i) ... This code will never terminate! Sometimes gcc will notice this bug and warn you, but often it will not. Equally bad bugs can occur when comparing signed and unsigned variables. Basically, C's type-promotion scheme causes unsigned types to behave differently than one might expect. So, document that a variable is non-negative using assertions. Don't use an unsigned type. 64-bit Portability link ▶Code should be 64-bit and 32-bit friendly. Bear in mind problems of printing, comparisons, and structure alignment. printf() specifiers for some types are not cleanly portable between 32-bit and 64-bit systems. C99 defines some portable format specifiers. Unfortunately, MSVC 7.1 does not understand some of these specifiers and the standard is missing a few, so we have to define our own ugly versions in some cases (in the style of the standard include file inttypes.h): // printf macros for size_t, in the style of inttypes.h #ifdef _LP64 #define __PRIS_PREFIX "z" #else #define __PRIS_PREFIX #endif // Use these macros after a % in a printf format string // to get correct 32/64 bit behavior, like this: // size_t size = records.size(); // printf("%"PRIuS"\n", size); #define PRIdS __PRIS_PREFIX "d" #define PRIxS __PRIS_PREFIX "x" #define PRIuS __PRIS_PREFIX "u" #define PRIXS __PRIS_PREFIX "X" #define PRIoS __PRIS_PREFIX "o" Type DO NOT use DO use Notes void * (or any pointer) %lx %p int64_t %qd, %lld %"PRId64" uint64_t %qu, %llu, %llx %"PRIu64", %"PRIx64" size_t %u %"PRIuS", %"PRIxS" C99 specifies %zu ptrdiff_t %d %"PRIdS" C99 specifies %zd Note that the PRI* macros expand to independent strings which are concatenated by the compiler. Hence if you are using a non-constant format string, you need to insert the value of the macro into the format, rather than the name. It is still possible, as usual, to include length specifiers, etc., after the % when using the PRI* macros. So, e.g. printf("x = %30"PRIuS"\n", x) would expand on 32-bit Linux to printf("x = %30" "u" "\n", x), which the compiler will treat as printf("x = %30u\n", x). Remember that sizeof(void *) != sizeof(int). Use intptr_t if you want a pointer-sized integer. You may need to be careful with structure alignments, particularly for structures being stored on disk. Any class/structure with a int64_t/uint64_t member will by default end up being 8-byte aligned on a 64-bit system. If you have such structures being shared on disk between 32-bit and 64-bit code, you will need to ensure that they are packed the same on both architectures. Most compilers offer a way to alter structure alignment. For gcc, you can use __attribute__((packed)). MSVC offers #pragma pack() and __declspec(align()). Use the LL or ULL suffixes a

3,242

社区成员

4,604

社区内容

发帖

与我相关

我的任务

社区管理员

加入社区

近7日
近30日
至今

加载中

查看更多榜单

社区公告

暂无公告

试试用AI创作助手写篇文章吧

+ 用AI写文章