Question: processing large files (C)

imGala 2013-12-04 10:50:04

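Question:
Language: C
I need to process N large files. A file may be very large, and no upper bound on its size is known in advance.
The files need to be processed in parallel.
The file contents may need decoding or similar processing.

I started with two approaches:
<1> Declare one huge array up front and read the whole file in one go. The problem is that this wastes memory, because a file may well be small.
<2> Use malloc to dynamically allocate a space the size of the file. The problem is that frequent dynamic allocation causes too much memory fragmentation, which is unsuitable for a long-running program.
C++ seems to have classes such as string that manage memory internally, but my C++ is not good enough to rely on.
Could someone help me think of a better way?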
imGala 2013-12-04
OK, thanks. Let me think about how exactly to do this.
版主大哥 2013-12-04
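If you are worried about memory fragmentation, use a memory pool. It depends on your requirements, but I would suggest reading the file in fixed-size chunks: read a chunk, process it, then read the next one. When the read returns a size <= 0, the file has been read completely. I don't know whether the data in your files can be split up like this. As for string and the like that you mentioned, some STL implementations also use a memory pool internally.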
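The fixed-size read loop described above could be sketched as follows. This is only a minimal sketch under assumptions: the names `process_file_in_chunks`, `chunk_fn` and `sum_bytes` and the 4096-byte `CHUNK_SIZE` are hypothetical and not from the thread, and it assumes the data can be handled one chunk at a time, which the reply itself was unsure about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CHUNK_SIZE 4096  /* assumed chunk size; tune for the workload */

/* Hypothetical per-chunk handler, called once for every chunk read. */
typedef void (*chunk_fn)(const unsigned char *data, size_t len, void *ctx);

/* Reads `path` in fixed-size chunks through one reused buffer, so memory
 * use stays constant whatever the file size and no malloc is needed.
 * Returns the total number of bytes read, or -1 if the file cannot be
 * opened or a read error occurs. */
long long process_file_in_chunks(const char *path, chunk_fn handle, void *ctx)
{
    unsigned char buf[CHUNK_SIZE];
    long long total = 0;
    size_t n;
    FILE *fp;

    assert(path != NULL && handle != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* fread returns 0 at end of file or on error; ferror tells them apart. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        handle(buf, n, ctx);
        total += (long long)n;
    }

    if (ferror(fp))
        total = -1;
    fclose(fp);
    return total;
}

/* Example handler: adds every byte to the unsigned long that ctx points to. */
void sum_bytes(const unsigned char *data, size_t len, void *ctx)
{
    unsigned long *sum = ctx;
    size_t i;

    for (i = 0; i < len; i++)
        *sum += data[i];
}
```

A call such as `process_file_in_chunks("input.bin", sum_bytes, &sum)` would run the handler over the whole file. A real decoder would probably also need to carry state across chunk boundaries, for example when a record is split between two reads.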
图灵狗 2013-12-04
You could use double buffering: allocate two buffers of, say, 4 KB each, one for reading data and one for parsing. When both are done, swap the two buffers, and repeat.
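One possible shape for the double-buffer idea, assuming POSIX threads are available (the thread does not say which platform or threading API is in use). A reader thread fills one 4 KB buffer while the calling thread parses the other, and the two alternate by toggling an index rather than physically swapping memory. The names `dbuf_t`, `slot_t`, `reader_main` and `parse_file_double_buffered` are hypothetical, and the "parsing" here is just a byte sum standing in for real decoding.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

#define BUF_SIZE 4096  /* the 4 KB suggested in the reply */

typedef struct {
    unsigned char data[BUF_SIZE];
    size_t len;   /* valid bytes; 0 in a full slot marks end of input */
    int full;     /* 1 = filled by the reader, waiting for the parser */
} slot_t;

typedef struct {
    FILE *fp;
    slot_t slot[2];
    pthread_mutex_t lock;
    pthread_cond_t changed;
} dbuf_t;

/* Reader thread: fills slots 0, 1, 0, 1, ... and stops after publishing
 * an empty slot, which tells the parser there is no more data. */
static void *reader_main(void *arg)
{
    dbuf_t *db = arg;
    int i = 0;

    for (;;) {
        slot_t *s = &db->slot[i];
        size_t n;

        pthread_mutex_lock(&db->lock);
        while (s->full)                       /* wait until the parser frees it */
            pthread_cond_wait(&db->changed, &db->lock);
        pthread_mutex_unlock(&db->lock);

        n = fread(s->data, 1, BUF_SIZE, db->fp);  /* I/O done outside the lock */

        pthread_mutex_lock(&db->lock);
        s->len = n;
        s->full = 1;
        pthread_cond_signal(&db->changed);
        pthread_mutex_unlock(&db->lock);

        if (n == 0)
            break;
        i ^= 1;                               /* switch to the other buffer */
    }
    return NULL;
}

/* Parses `path` while the reader thread fetches the next chunk.
 * Returns the number of bytes parsed, or -1 on failure. */
long long parse_file_double_buffered(const char *path, unsigned long *byte_sum)
{
    dbuf_t db;
    pthread_t reader;
    long long total = 0;
    int i = 0;

    assert(path != NULL && byte_sum != NULL);
    db.fp = fopen(path, "rb");
    if (db.fp == NULL)
        return -1;
    db.slot[0].full = db.slot[1].full = 0;
    db.slot[0].len = db.slot[1].len = 0;
    pthread_mutex_init(&db.lock, NULL);
    pthread_cond_init(&db.changed, NULL);

    if (pthread_create(&reader, NULL, reader_main, &db) != 0) {
        total = -1;
    } else {
        for (;;) {
            slot_t *s = &db.slot[i];
            size_t k;

            pthread_mutex_lock(&db.lock);
            while (!s->full)                  /* wait until the reader fills it */
                pthread_cond_wait(&db.changed, &db.lock);
            pthread_mutex_unlock(&db.lock);

            if (s->len == 0)
                break;                        /* end-of-input marker */
            for (k = 0; k < s->len; k++)      /* stand-in for real parsing */
                *byte_sum += s->data[k];
            total += (long long)s->len;

            pthread_mutex_lock(&db.lock);
            s->full = 0;                      /* hand the buffer back */
            pthread_cond_signal(&db.changed);
            pthread_mutex_unlock(&db.lock);
            i ^= 1;
        }
        pthread_join(reader, NULL);
        if (ferror(db.fp))
            total = -1;
    }

    pthread_cond_destroy(&db.changed);
    pthread_mutex_destroy(&db.lock);
    fclose(db.fp);
    return total;
}
```

On older toolchains this needs to be built with `-pthread`. Because only two fixed buffers exist for the lifetime of the call, memory use is constant and nothing is allocated per file, which addresses the fragmentation concern in the question.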
赵4老师 2013-12-04
CreateFileMapping

The CreateFileMapping function creates a named or unnamed file-mapping object for the specified file.

    HANDLE CreateFileMapping(
      HANDLE hFile,                                  // handle to file to map
      LPSECURITY_ATTRIBUTES lpFileMappingAttributes, // optional security attributes
      DWORD flProtect,                               // protection for mapping object
      DWORD dwMaximumSizeHigh,                       // high-order 32 bits of object size
      DWORD dwMaximumSizeLow,                        // low-order 32 bits of object size
      LPCTSTR lpName                                 // name of file-mapping object
    );

Parameters

hFile
Handle to the file from which to create a mapping object. The file must be opened with an access mode compatible with the protection flags specified by the flProtect parameter. It is recommended, though not required, that files you intend to map be opened for exclusive access.
If hFile is (HANDLE)0xFFFFFFFF, the calling process must also specify a mapping object size in the dwMaximumSizeHigh and dwMaximumSizeLow parameters. The function creates a file-mapping object of the specified size backed by the operating-system paging file rather than by a named file in the file system. The file-mapping object can be shared through duplication, through inheritance, or by name.

lpFileMappingAttributes
Pointer to a SECURITY_ATTRIBUTES structure that determines whether the returned handle can be inherited by child processes. If lpFileMappingAttributes is NULL, the handle cannot be inherited.
Windows NT: The lpSecurityDescriptor member of the structure specifies a security descriptor for the new file-mapping object. If lpFileMappingAttributes is NULL, the file-mapping object gets a default security descriptor.

flProtect
Specifies the protection desired for the file view, when the file is mapped. This parameter can be one of the following values:

PAGE_READONLY: Gives read-only access to the committed region of pages. An attempt to write to or execute the committed region results in an access violation. The file specified by the hFile parameter must have been created with GENERIC_READ access.
PAGE_READWRITE: Gives read-write access to the committed region of pages. The file specified by hFile must have been created with GENERIC_READ and GENERIC_WRITE access.
PAGE_WRITECOPY: Gives copy on write access to the committed region of pages. The files specified by the hFile parameter must have been created with GENERIC_READ and GENERIC_WRITE access.

In addition, an application can specify certain section attributes by combining (using the bitwise OR operator) one or more of the following section attribute values with one of the preceding page protection values:

SEC_COMMIT: Allocates physical storage in memory or in the paging file on disk for all pages of a section. This is the default setting.
SEC_IMAGE: The file specified for a section's file mapping is an executable image file. Because the mapping information and file protection are taken from the image file, no other attributes are valid with SEC_IMAGE.
SEC_NOCACHE: All pages of a section are to be set as non-cacheable. This attribute is intended for architectures requiring various locking structures to be in memory that is never fetched into the processor's cache. On 80x86 and MIPS machines, using the cache for these structures only slows down the performance as the hardware keeps the caches coherent. Some device drivers require noncached data so that programs can write through to the physical memory. SEC_NOCACHE requires either the SEC_RESERVE or SEC_COMMIT to also be set.
SEC_RESERVE: Reserves all pages of a section without allocating physical storage. The reserved range of pages cannot be used by any other allocation operations until it is released. Reserved pages can be committed in subsequent calls to the VirtualAlloc function. This attribute is valid only if the hFile parameter is (HANDLE)0xFFFFFFFF; that is, a file mapping object backed by the operating system paging file.

dwMaximumSizeHigh
Specifies the high-order 32 bits of the maximum size of the file-mapping object.

dwMaximumSizeLow
Specifies the low-order 32 bits of the maximum size of the file-mapping object. If this parameter and dwMaximumSizeHigh are zero, the maximum size of the file-mapping object is equal to the current size of the file identified by hFile.

lpName
Pointer to a null-terminated string specifying the name of the mapping object. The name can contain any character except the backslash character (\).
If this parameter matches the name of an existing named mapping object, the function requests access to the mapping object with the protection specified by flProtect.
If this parameter is NULL, the mapping object is created without a name.
If lpName matches the name of an existing event, semaphore, mutex, waitable timer, or job, the function fails and the GetLastError function returns ERROR_INVALID_HANDLE. This occurs because these objects share the same name space.

Return Values

If the function succeeds, the return value is a handle to the file-mapping object. If the object existed before the function call, the function returns a handle to the existing object (with its current size, not the specified size) and GetLastError returns ERROR_ALREADY_EXISTS.
If the function fails, the return value is NULL. To get extended error information, call GetLastError.

Remarks

After a file-mapping object has been created, the size of the file must not exceed the size of the file-mapping object; if it does, not all of the file's contents will be available for sharing.

If an application specifies a size for the file-mapping object that is larger than the size of the actual named file on disk, the file on disk is grown to match the specified size of the file-mapping object. If the file cannot be grown, this results in a failure to create the file-mapping object. GetLastError will return ERROR_DISK_FULL.

The handle that CreateFileMapping returns has full access to the new file-mapping object. It can be used with any function that requires a handle to a file-mapping object. File-mapping objects can be shared either through process creation, through handle duplication, or by name. For information on duplicating handles, see DuplicateHandle. For information on opening a file-mapping object by name, see OpenFileMapping.

Windows 95: File handles that have been used to create file-mapping objects must not be used in subsequent calls to file I/O functions, such as ReadFile and WriteFile. In general, if a file handle has been used in a successful call to the CreateFileMapping function, do not use that handle unless you first close the corresponding file-mapping object.

Creating a file-mapping object creates the potential for mapping a view of the file but does not map the view. The MapViewOfFile and MapViewOfFileEx functions map a view of a file into a process's address space.

With one important exception, file views derived from a single file-mapping object are coherent, or identical, at a given time. If multiple processes have handles of the same file-mapping object, they see a coherent view of the data when they map a view of the file.

The exception has to do with remote files. Although CreateFileMapping works with remote files, it does not keep them coherent. For example, if two computers both map a file as writeable, and both change the same page, each computer will only see its own writes to the page. When the data gets updated on the disk, it is not merged.

A mapped file and a file accessed by means of the input and output (I/O) functions (ReadFile and WriteFile) are not necessarily coherent.

To fully close a file mapping object, an application must unmap all mapped views of the file mapping object by calling UnmapViewOfFile, and close the file mapping object handle by calling CloseHandle. The order in which these functions are called does not matter. The call to UnmapViewOfFile is necessary because mapped views of a file mapping object maintain internal open handles to the object, and a file mapping object will not close until all open handles to it are closed.

Note: To guard against an access violation, use structured exception handling to protect any code that writes to or reads from a memory mapped view. For more information, see Reading and Writing.

Windows CE: Windows CE does not use the lpFileMappingAttributes parameter. It must be NULL. This function will not work on a device that does not support Page-In.

Example

To implement a mapping-object creation function that fails if the object already exists, an application can use the following code.

    hMap = CreateFileMapping(...);
    if (hMap != NULL && GetLastError() == ERROR_ALREADY_EXISTS) {
        CloseHandle(hMap);
        hMap = NULL;
    }
    return hMap;

QuickInfo

Windows NT: Requires version 3.1 or later.
Windows: Requires Windows 95 or later.
Windows CE: Requires version 1.0 or later.
Header: Declared in winbase.h.
Import Library: Use kernel32.lib.
Unicode: Implemented as Unicode and ANSI versions on Windows NT.

See Also

File Mapping Overview, File Mapping Functions, CloseHandle, DuplicateHandle, MapViewOfFile, MapViewOfFileEx, MapViewOfFileVlm, OpenFileMapping, ReadFile, SECURITY_ATTRIBUTES, UnmapViewOfFile, UnmapViewOfFileVlm, VirtualAlloc, WriteFile
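To make the documentation above concrete, here is a minimal sketch of reading a file through a mapping instead of a user-allocated buffer. The function name `sum_mapped_file` is hypothetical. The Windows branch uses CreateFileMapping and MapViewOfFile as described above; the `#else` branch is a substituted POSIX equivalent (`mmap`) that is not part of the original reply, included only so the sketch also builds on non-Windows systems. It maps the whole file in one view for brevity, which is an assumption that may not hold for files larger than the available address space; in that case one would map successive views at increasing offsets.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#endif

/* Maps `path` read-only and adds every byte to *sum.
 * Returns 0 on success (including an empty file), -1 on failure. */
int sum_mapped_file(const char *path, unsigned long *sum)
{
    const unsigned char *p;
    size_t size, i;

    assert(path != NULL && sum != NULL);
    *sum = 0;

#ifdef _WIN32
    HANDLE file = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE)
        return -1;
    LARGE_INTEGER fsize;
    if (!GetFileSizeEx(file, &fsize)) { CloseHandle(file); return -1; }
    if (fsize.QuadPart == 0) { CloseHandle(file); return 0; }  /* nothing to map */
    size = (size_t)fsize.QuadPart;

    /* Zero for both size arguments means "use the file's current size". */
    HANDLE map = CreateFileMappingA(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (map == NULL) { CloseHandle(file); return -1; }
    p = MapViewOfFile(map, FILE_MAP_READ, 0, 0, 0);
    if (p == NULL) { CloseHandle(map); CloseHandle(file); return -1; }
#else
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }              /* nothing to map */
    size = (size_t)st.st_size;

    void *addr = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                    /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
        return -1;
    p = addr;
#endif

    /* The file contents are now addressable like an ordinary array;
     * the operating system pages data in on demand. */
    for (i = 0; i < size; i++)
        *sum += p[i];

#ifdef _WIN32
    UnmapViewOfFile(p);
    CloseHandle(map);
    CloseHandle(file);
#else
    munmap(addr, size);
#endif
    return 0;
}
```

With this approach the program does not malloc a file-sized buffer at all, so the fragmentation concern from the question does not arise in the same form, though address space is still consumed while a view is mapped.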
